[committed] Add myself to write after approval

2024-02-22 Thread Monk Chiang
ChangeLog:

* MAINTAINERS: Add myself.
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 617133447f0..e89833fb83e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -373,6 +373,7 @@ Gabriel Charette

 Chandra Chavva 
 Dehao Chen 
 Fabien Chêne
+Monk Chiang
 Clément Chigot
 Harshit Chopra 
 Tamar Christina

-- 
2.40.1



Re: [PATCH] RISC-V: Fix error combine of pred_mov pattern

2024-02-22 Thread Jeff Law




On 2/19/24 21:21, Alexandre Oliva wrote:

This backport is the second of two required for the pr111935 testcase,
already backported to gcc-13, to pass on riscv64-elf and riscv32-elf.
The V_VLS mode iterator, used in the original patch, is not available in
gcc-13, and I thought that would be too much to backport (and maybe so
are these two patches, WDYT?), so I changed it to V, to match the
preexisting gcc-13 pattern.  Comments also needed manual adjustment.
Regstrapped on x86_64-linux-gnu, along with other backports, and tested
manually on riscv64-elf.  Ok to install?

From: Lehua Ding 

This patch fixes PR110943, in which wrong code is generated. The cause is
an incorrect combine of some pred_mov patterns. Consider this code:

```

void foo9 (void *base, void *out, size_t vl)
{
 int64_t scalar = *(int64_t*)(base + 100);
 vint64m2_t v = __riscv_vmv_v_x_i64m2 (0, 1);
 *(vint64m2_t*)out = v;
}
```

RTL before combine pass:

```
(insn 11 10 12 2 (set (reg/v:RVVM2DI 134 [ v ])
 (if_then_else:RVVM2DI (unspec:RVVMF32BI [
 (const_vector:RVVMF32BI repeat [
 (const_int 1 [0x1])
 ])
 (const_int 1 [0x1])
 (const_int 2 [0x2]) repeated x2
 (const_int 0 [0])
 (reg:SI 66 vl)
 (reg:SI 67 vtype)
 ] UNSPEC_VPREDICATE)
 (const_vector:RVVM2DI repeat [
 (const_int 0 [0])
 ])
 (unspec:RVVM2DI [
 (reg:SI 0 zero)
 ] UNSPEC_VUNDEF))) "/app/example.c":6:20 1089 
{pred_movrvvm2di})
(insn 14 13 0 2 (set (mem:RVVM2DI (reg/v/f:DI 136 [ out ]) [1 MEM[(vint64m2_t 
*)out_4(D)]+0 S[32, 32] A128])
 (reg/v:RVVM2DI 134 [ v ])) "/app/example.c":7:23 717 
{*movrvvm2di_whole})
```

RTL after combine pass:
```
(insn 14 13 0 2 (set (mem:RVVM2DI (reg:DI 138) [1 MEM[(vint64m2_t *)out_4(D)]+0 
S[32, 32] A128])
 (if_then_else:RVVM2DI (unspec:RVVMF32BI [
 (const_vector:RVVMF32BI repeat [
 (const_int 1 [0x1])
 ])
 (const_int 1 [0x1])
 (const_int 2 [0x2]) repeated x2
 (const_int 0 [0])
 (reg:SI 66 vl)
 (reg:SI 67 vtype)
 ] UNSPEC_VPREDICATE)
 (const_vector:RVVM2DI repeat [
 (const_int 0 [0])
 ])
 (unspec:RVVM2DI [
 (reg:SI 0 zero)
 ] UNSPEC_VUNDEF))) "/app/example.c":7:23 1089 
{pred_movrvvm2di})
```

This combine changes the semantics of insn 14. I split the @pred_mov pattern
and restricted the condition of @pred_mov.

PR target/110943

gcc/ChangeLog:

* config/riscv/predicates.md (vector_const_int_or_double_0_operand):
New predicate.
* config/riscv/riscv-vector-builtins.cc 
(function_expander::function_expander):
force_reg mem target operand.
* config/riscv/vector.md (@pred_mov): Wrapper.
(*pred_mov): Remove imm -> reg pattern.
(*pred_broadcast_imm): Add imm -> reg pattern.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr110943.c: New test.

I'd leave this alone as well.  I just don't see much value in the backports.

jeff



Re: [PATCH] RISC-V: Revert the convert from vmv.s.x to vmv.v.i

2024-02-22 Thread Jeff Law




On 2/19/24 21:15, Alexandre Oliva wrote:

This backport is the first of two required for the pr111935 testcase,
already backported to gcc-13, to pass on riscv64-elf and riscv32-elf.
The V_VLS mode iterator, used in the original patch, is not available in
gcc-13, and I thought that would be too much to backport (and maybe so
are these two patches, WDYT?), so I changed it to V, to match the
preexisting gcc-13 pattern.  Regstrapped on x86_64-linux-gnu, along with
other backports, and tested manually on riscv64-elf.  Ok to install?

From: Lehua Ding 

Hi,

This patch reverts the conversion from vmv.s.x to vmv.v.i and adds a new
pattern to optimize the special case where the scalar operand is zero.

Currently, the broadcast pattern where the scalar operand is an imm
is converted from vmv.s.x to vmv.v.i, and the mask operand is
converted from 00..01 to 11..11. Both forms have advantages and
disadvantages; after discussing them with Juzhe offline, we chose
not to do this transform.

Before:

   Advantages: The vsetvli info required by vmv.s.x has better compatibility, since
   vmv.s.x only requires SEW and VLEN be zero or one. That means there
   are more opportunities to combine with other vsetvl infos in the vsetvl pass.

   Disadvantages: For non-zero scalar imm, one more `li rd, imm` instruction
   will be needed.

After:

   Advantages: No `li rd, imm` instruction is needed, since vmv.v.i supports
   an imm operand.

   Disadvantages: The inverse of the advantages listed under "Before": worse
   compatibility means more vsetvl instructions are needed.

Consider the C code below and the asm after autovec.
There is an extra insn (vsetivli zero, 1, e32, m1, ta, ma)
after converting vmv.s.x to vmv.v.i.

```
int foo1(int* restrict a, int* restrict b, int *restrict c, int n) {
 int sum = 0;
 for (int i = 0; i < n; i++)
   sum += a[i] * b[i];

 return sum;
}
```

asm (Before):

```
foo1:
 ble a3,zero,.L7
 vsetvli a2,zero,e32,m1,ta,ma
 vmv.v.i v1,0
.L6:
 vsetvli a5,a3,e32,m1,tu,ma
 slli a4,a5,2
 sub a3,a3,a5
 vle32.v v2,0(a0)
 vle32.v v3,0(a1)
 add a0,a0,a4
 add a1,a1,a4
 vmacc.vv v1,v3,v2
 bne a3,zero,.L6
 vsetvli a2,zero,e32,m1,ta,ma
 vmv.s.x v2,zero
 vredsum.vs  v1,v1,v2
 vmv.x.s a0,v1
 ret
.L7:
 li  a0,0
 ret
```

asm (After):

```
foo1:
 ble a3,zero,.L4
 vsetvli a2,zero,e32,m1,ta,ma
 vmv.v.i v1,0
.L3:
 vsetvli a5,a3,e32,m1,tu,ma
 slli a4,a5,2
 sub a3,a3,a5
 vle32.v v2,0(a0)
 vle32.v v3,0(a1)
 add a0,a0,a4
 add a1,a1,a4
 vmacc.vv v1,v3,v2
 bne a3,zero,.L3
 vsetivli zero,1,e32,m1,ta,ma
 vmv.v.i v2,0
 vsetvli a2,zero,e32,m1,ta,ma
 vredsum.vs  v1,v1,v2
 vmv.x.s a0,v1
 ret
.L4:
 li  a0,0
 ret
```

Best,
Lehua

Co-Authored-By: Ju-Zhe Zhong 

gcc/ChangeLog:

* config/riscv/predicates.md (vector_const_0_operand): New.
* config/riscv/vector.md (*pred_broadcast_zero): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/scalar_move-5.c: Update.
* gcc.target/riscv/rvv/base/scalar_move-6.c: Ditto.

I wouldn't backport this.  Vector isn't something that's really expected 
to work with gcc-13.  Yea, you can do a bit of intrinsics, but that's it.


Jeff



Re: [PATCH] RISC-V: Fix CTZ unnecessary sign extension [PR #106888]

2024-02-22 Thread Jeff Law




On 2/20/24 07:21, Alexandre Oliva wrote:

On Feb 20, 2024, Jeff Law  wrote:


On 2/19/24 21:26, Alexandre Oliva wrote:

This backport for gcc-13 is required for pr90838.c to get the expected
count of andi instructions on riscv64-elf.

In general, shouldn't backports be focused on correctness issues?


*nod*.


It's unclear what the motivation is for backporting this change into
gcc-13.


There's this unexpected fail in gcc-13 (pr90838.c), one out of a handful
that we've hit while transitioning our riscv toolchains to gcc-13.

I set out to understand them, I identified the patches that got them to
pass in the trunk, and so I've proposed their backports to fix the fails
in gcc-13.

Surely there are other ways to address each one of the fails.

But even if we choose to just xfail them, or leave them failing noisily,
I've already gone through the process of identifying the fix, so I
figured I might as well share it.

Thanks for explaining things.  I had a feeling the motivation might be 
something along those lines.


I'd tend to think we don't want this backported.  It doesn't fix any 
correctness issue and the performance impact is small.  I also don't 
expect gcc-13 is going to be of significant long term interest in the 
RISC-V space as it predates any RVV support.


So this feels like it ought to be left as-is on the gcc-13 branch.

jeff


Re: [PATCH v3] LoongArch: Split loongarch_option_override_internal into smaller procedures

2024-02-22 Thread Yang Yujie
v1 -> v2:
- Rebased to master.
- Specifies "(void)" for the empty parameter list of loongarch_global_init.

v2 -> v3:
- Keep the original option-processing behavior (-march=la664 enables -mrecip=all)
  and fix the ICE when -mfrecipe is not passed with -mrecip.



[PATCH v3] LoongArch: Split loongarch_option_override_internal into smaller procedures

2024-02-22 Thread Yang Yujie
gcc/ChangeLog:

* config/loongarch/genopts/loongarch.opt.in: Mark -m[no-]recip as
aliases to -mrecip={all,none}.
* config/loongarch/loongarch.opt: Same.
* config/loongarch/loongarch-def.h: Modify ABI condition macros for
convenience.
* config/loongarch/loongarch-opts.cc: Define option-handling
procedures split from the original loongarch_option_override_internal.
* config/loongarch/loongarch-opts.h: Same.
* config/loongarch/loongarch.cc: Clean up
loongarch_option_override_internal.
---
 gcc/config/loongarch/genopts/loongarch.opt.in |   8 +-
 gcc/config/loongarch/loongarch-def.h  |  11 +-
 gcc/config/loongarch/loongarch-opts.cc| 253 ++
 gcc/config/loongarch/loongarch-opts.h |  27 +-
 gcc/config/loongarch/loongarch.cc | 253 +++---
 gcc/config/loongarch/loongarch.h  |  18 +-
 gcc/config/loongarch/loongarch.opt|   8 +-
 7 files changed, 342 insertions(+), 236 deletions(-)

diff --git a/gcc/config/loongarch/genopts/loongarch.opt.in 
b/gcc/config/loongarch/genopts/loongarch.opt.in
index 02f918053f5..a77893d31d9 100644
--- a/gcc/config/loongarch/genopts/loongarch.opt.in
+++ b/gcc/config/loongarch/genopts/loongarch.opt.in
@@ -197,14 +197,14 @@ mexplicit-relocs
 Target Alias(mexplicit-relocs=, always, none)
 Use %reloc() assembly operators (for backward compatibility).
 
-mrecip
-Target RejectNegative Var(la_recip) Save
-Generate approximate reciprocal divide and square root for better throughput.
-
 mrecip=
 Target RejectNegative Joined Var(la_recip_name) Save
 Control generation of reciprocal estimates.
 
+mrecip
+Target Alias(mrecip=, all, none)
+Generate approximate reciprocal divide and square root for better throughput.
+
 ; The code model option names for -mcmodel.
 Enum
 Name(cmodel) Type(int)
diff --git a/gcc/config/loongarch/loongarch-def.h 
b/gcc/config/loongarch/loongarch-def.h
index 2dbf006d013..0cbf9476690 100644
--- a/gcc/config/loongarch/loongarch-def.h
+++ b/gcc/config/loongarch/loongarch-def.h
@@ -90,11 +90,16 @@ extern loongarch_def_array
 
 #define TO_LP64_ABI_BASE(C) (C)
 
-#define ABI_FPU_64(abi_base) \
+#define ABI_LP64_P(abi_base) \
+  (abi_base == ABI_BASE_LP64D \
+   || abi_base == ABI_BASE_LP64F \
+   || abi_base == ABI_BASE_LP64S)
+
+#define ABI_FPU64_P(abi_base) \
   (abi_base == ABI_BASE_LP64D)
-#define ABI_FPU_32(abi_base) \
+#define ABI_FPU32_P(abi_base) \
   (abi_base == ABI_BASE_LP64F)
-#define ABI_FPU_NONE(abi_base) \
+#define ABI_NOFPU_P(abi_base) \
   (abi_base == ABI_BASE_LP64S)
 
 
diff --git a/gcc/config/loongarch/loongarch-opts.cc 
b/gcc/config/loongarch/loongarch-opts.cc
index 7eeac43ed2f..e5f27b8716f 100644
--- a/gcc/config/loongarch/loongarch-opts.cc
+++ b/gcc/config/loongarch/loongarch-opts.cc
@@ -25,6 +25,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "coretypes.h"
 #include "tm.h"
 #include "obstack.h"
+#include "opts.h"
 #include "diagnostic-core.h"
 
 #include "loongarch-cpu.h"
@@ -32,8 +33,12 @@ along with GCC; see the file COPYING3.  If not see
 #include "loongarch-str.h"
 #include "loongarch-def.h"
 
+/* Target configuration */
 struct loongarch_target la_target;
 
+/* RTL cost information */
+const struct loongarch_rtx_cost_data *loongarch_cost;
+
 /* ABI-related configuration.  */
 #define ABI_COUNT (sizeof(abi_priority_list)/sizeof(struct loongarch_abi))
 static const struct loongarch_abi
@@ -795,3 +800,251 @@ loongarch_update_gcc_opt_status (struct loongarch_target 
*target,
   /* ISA evolution features */
   opts->x_la_isa_evolution = target->isa.evolution;
 }
+
+/* -mrecip= handling */
+static struct
+  {
+const char *string;/* option name.  */
+unsigned int mask; /* mask bits to set.  */
+  }
+const recip_options[] = {
+  { "all",   RECIP_MASK_ALL },
+  { "none",  RECIP_MASK_NONE },
+  { "div",   RECIP_MASK_DIV },
+  { "sqrt",  RECIP_MASK_SQRT },
+  { "rsqrt", RECIP_MASK_RSQRT },
+  { "vec-div",   RECIP_MASK_VEC_DIV },
+  { "vec-sqrt",  RECIP_MASK_VEC_SQRT },
+  { "vec-rsqrt", RECIP_MASK_VEC_RSQRT },
+};
+
+/* Parser for -mrecip=.  */
+unsigned int
+loongarch_parse_mrecip_scheme (const char *recip_string)
+{
+  unsigned int result_mask = RECIP_MASK_NONE;
+
+  if (recip_string)
+{
+  char *p = ASTRDUP (recip_string);
+  char *q;
+  unsigned int mask, i;
+  bool invert;
+
+  while ((q = strtok (p, ",")) != NULL)
+   {
+ p = NULL;
+ if (*q == '!')
+   {
+ invert = true;
+ q++;
+   }
+ else
+   invert = false;
+
+ if (!strcmp (q, "default"))
+   mask = RECIP_MASK_ALL;
+ else
+   {
+ for (i = 0; i < ARRAY_SIZE (recip_options); i++)
+   if (!strcmp (q, recip_options[i].string))
+ {
+   mask = recip_options[i].mask;
+

Re: [PATCH] RISC-V: Point our Python scripts at python3

2024-02-22 Thread Kito Cheng
I guess Palmer is too busy, so committed to trunk :P

On Tue, Feb 13, 2024 at 11:55 PM Jeff Law  wrote:
>
>
>
> On 2/9/24 09:53, Palmer Dabbelt wrote:
> > This builds for me, and I frequently have python-is-python3 type
> > packages installed so I think I've been implicitly testing it for a
> > while.  Looks like Kito's tested similar configurations, and the
> > bugzilla indicates we should be moving over.
> >
> > gcc/ChangeLog:
> >
> >   PR 109668
> >   * config/riscv/arch-canonicalize: Move to python3
> >   * config/riscv/multilib-generator: Likewise
> Just to summarize from the coordination call this morning.  We've agreed
> this should go forward.  While there is minor risk (this code is rarely
> run), it's something we're prepared to handle if there is fallout.
>
> Jeff


Re: [PATCH] doc: RISC-V: Document that -mcpu doesn't override -march or -mtune

2024-02-22 Thread Kito Cheng
LGTM, and committed :)

On Tue, Feb 20, 2024 at 11:46 PM Palmer Dabbelt  wrote:
>
> This came up recently as Edwin was looking through the test suite.  A
> few of us were talking about this during the patchwork meeting and were
> surprised.  Looks like this is the desired behavior, so let's at least
> document it.
>
> gcc/ChangeLog:
>
> * doc/invoke.texi: Document -mcpu.
>
> Signed-off-by: Palmer Dabbelt 
> ---
>  gcc/doc/invoke.texi | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 6ec56493e59..4a4bba9f1cd 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -30670,6 +30670,8 @@ Permissible values for this option are: 
> @samp{sifive-e20}, @samp{sifive-e21},
>  @samp{sifive-s21}, @samp{sifive-s51}, @samp{sifive-s54}, @samp{sifive-s76},
>  @samp{sifive-u54}, @samp{sifive-u74}, and @samp{sifive-x280}.
>
> +Note that @option{-mcpu} does not override @option{-march} or 
> @option{-mtune}.
> +
>  @opindex mtune
>  @item -mtune=@var{processor-string}
>  Optimize the output for the given processor, specified by microarchitecture 
> or
> --
> 2.43.0
>


Re: [PATCH] LoongArch: Don't falsely claim gold supported in toplevel configure

2024-02-22 Thread chenglulu



On 2024/2/23 11:27 AM, Xi Ruoyao wrote:

On Fri, 2024-02-23 at 11:16 +0800, chenglulu wrote:

On 2024/2/22 5:17 PM, Xi Ruoyao wrote:

The gold linker has never been ported to LoongArch (and it seems
unlikely to be ported in the future as the new architectures are
focusing on lld and/or mold for fast linkers).

ChangeLog:

    * configure.ac (ENABLE_GOLD): Remove loongarch*-*-* from target
    list.
    * configure: Regenerate.
---

Ok for GCC trunk (to get synced into Binutils later)?

I have no problem. But I have a question. Is this modification simply
because we don’t support it or is there an error somewhere?

If a user specifies --enable-gold when building Binutils, then with
loongarch in this list the build system will attempt to build gold and
fail.  With loongarch removed from the list, the build system will
ignore --enable-gold.


Okay, I understand.

Thanks!:-)



Re: [PATCH] LoongArch: Don't falsely claim gold supported in toplevel configure

2024-02-22 Thread Xi Ruoyao
On Fri, 2024-02-23 at 11:16 +0800, chenglulu wrote:
> 
> On 2024/2/22 5:17 PM, Xi Ruoyao wrote:
> > The gold linker has never been ported to LoongArch (and it seems
> > unlikely to be ported in the future as the new architectures are
> > focusing on lld and/or mold for fast linkers).
> > 
> > ChangeLog:
> > 
> >     * configure.ac (ENABLE_GOLD): Remove loongarch*-*-* from target
> >     list.
> >     * configure: Regenerate.
> > ---
> > 
> > Ok for GCC trunk (to get synced into Binutils later)?
> 
> I have no problem. But I have a question. Is this modification simply
> because we don’t support it or is there an error somewhere?

If a user specifies --enable-gold when building Binutils, then with
loongarch in this list the build system will attempt to build gold and
fail.  With loongarch removed from the list, the build system will
ignore --enable-gold.

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


Re: [pushed][PATCH v1] LoongArch: When checking whether the assembler supports conditional branch relaxation, add compilation parameter "--fatal-warnings" to the assembler.

2024-02-22 Thread chenglulu

Pushed to r14-9142.

On 2024/2/21 11:30 AM, Lulu Cheng wrote:

In binutils 2.40 and earlier versions, only a warning will be reported
when a relocation immediate value is out of bounds. As a result,
the value of the macro HAVE_AS_COND_BRANCH_RELAXATION will also be
defined as 1 when the assembler does not support conditional branch
relaxation. Therefore, add the compilation option "--fatal-warnings"
to avoid this problem.

gcc/ChangeLog:

* configure: Regenerate.
* configure.ac: Add parameter "--fatal-warnings" to the assembler
when checking whether the assembler supports conditional branch
relaxation.
---
  gcc/configure| 2 +-
  gcc/configure.ac | 2 +-
  2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/configure b/gcc/configure
index 41b978b0380..f1d434fede0 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -31136,7 +31136,7 @@ else
 nop
 .endr
 beq $a0,$a1,a' > conftest.s
-if { ac_try='$gcc_cv_as $gcc_cv_as_flags  -o conftest.o conftest.s >&5'
+if { ac_try='$gcc_cv_as $gcc_cv_as_flags --fatal-warnings -o conftest.o 
conftest.s >&5'
{ { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_try\""; } >&5
(eval $ac_try) 2>&5
ac_status=$?
diff --git a/gcc/configure.ac b/gcc/configure.ac
index 72012d61e67..9ebc578e4cc 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -5486,7 +5486,7 @@ x:
[Define if your assembler supports -mrelax option.])])
  gcc_GAS_CHECK_FEATURE([conditional branch relaxation support],
gcc_cv_as_loongarch_cond_branch_relax,
-  [],
+  [--fatal-warnings],
[a:
 .rept 32769
 nop




Re: [PATCH] LoongArch: Don't falsely claim gold supported in toplevel configure

2024-02-22 Thread chenglulu



On 2024/2/22 5:17 PM, Xi Ruoyao wrote:

The gold linker has never been ported to LoongArch (and it seems
unlikely to be ported in the future as the new architectures are
focusing on lld and/or mold for fast linkers).

ChangeLog:

* configure.ac (ENABLE_GOLD): Remove loongarch*-*-* from target
list.
* configure: Regenerate.
---

Ok for GCC trunk (to get synced into Binutils later)?


I have no problem. But I have a question. Is this modification simply
because we don’t support it or is there an error somewhere?



  configure| 2 +-
  configure.ac | 2 +-
  2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/configure b/configure
index 874966fb9f0..02b435c1163 100755
--- a/configure
+++ b/configure
@@ -3092,7 +3092,7 @@ case "${ENABLE_GOLD}" in
# Check for target supported by gold.
case "${target}" in
  i?86-*-* | x86_64-*-* | sparc*-*-* | powerpc*-*-* | arm*-*-* \
-| aarch64*-*-* | tilegx*-*-* | mips*-*-* | s390*-*-* | loongarch*-*-*)
+| aarch64*-*-* | tilegx*-*-* | mips*-*-* | s390*-*-*)
  configdirs="$configdirs gold"
  if test x${ENABLE_GOLD} = xdefault; then
default_ld=gold
diff --git a/configure.ac b/configure.ac
index 4f34004a072..1a19c07a27b 100644
--- a/configure.ac
+++ b/configure.ac
@@ -364,7 +364,7 @@ case "${ENABLE_GOLD}" in
# Check for target supported by gold.
case "${target}" in
  i?86-*-* | x86_64-*-* | sparc*-*-* | powerpc*-*-* | arm*-*-* \
-| aarch64*-*-* | tilegx*-*-* | mips*-*-* | s390*-*-* | loongarch*-*-*)
+| aarch64*-*-* | tilegx*-*-* | mips*-*-* | s390*-*-*)
  configdirs="$configdirs gold"
  if test x${ENABLE_GOLD} = xdefault; then
default_ld=gold




Re: [PATCH] RISC-V: Fix vec_init for simple sequences [PR114028].

2024-02-22 Thread juzhe.zh...@rivai.ai
Sorry, I missed reviewing the testcase:

+/* { dg-final { scan-assembler-times "vmv\.v\.i\tv\[0-9\],0" 0 } } */

I think you should use "scan-assembler-not" here.



juzhe.zh...@rivai.ai
 
From: Robin Dapp
Date: 2024-02-23 04:02
To: gcc-patches; palmer; Kito Cheng; juzhe.zh...@rivai.ai
CC: rdapp.gcc; jeffreyalaw
Subject: [PATCH] RISC-V: Fix vec_init for simple sequences [PR114028].
Hi,
 
for a vec_init (_a, _a, _a, _a) with _a of mode DImode we try to
construct a "superword" of two "_a"s.  This only works for modes < Pmode
when we can "shift and or" two halves into one Pmode register.
This patch disallows the optimization for inner_mode == Pmode and emits
a simple broadcast in such a case.
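
To make the "shift and or" limitation concrete, here is a minimal sketch in plain C rather than GCC's RTL API. It assumes a 64-bit Pmode (rv64); the `sketch_*` names are hypothetical and are not functions from riscv-v.cc. Two elements can be merged into one superword only when both fit in a single 64-bit register, so a pair of 64-bit (DImode) elements cannot be packed, and an all-same sequence is better served by a broadcast.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Return true if all N elements of V are equal, i.e. the vector could be
   built with a single broadcast of V[0].  */
bool
sketch_all_same (const uint64_t *v, size_t n)
{
  for (size_t i = 1; i < n; i++)
    if (v[i] != v[0])
      return false;
  return true;
}

/* Try to merge two ELT_BITS-wide elements into one 64-bit "superword",
   with LO in the low bits.  This only works when 2 * ELT_BITS <= 64;
   for 64-bit elements there is no room left and the merge is refused.  */
bool
sketch_pack_pair (uint64_t lo, uint64_t hi, unsigned int elt_bits,
		  uint64_t *out)
{
  if (elt_bits == 0 || elt_bits * 2 > 64)
    return false;

  /* elt_bits is at most 32 here, so the shift below is well defined.  */
  uint64_t mask = (1ULL << elt_bits) - 1;
  *out = ((hi & mask) << elt_bits) | (lo & mask);
  return true;
}
```

For 32-bit elements 0x11111111 and 0x22222222 the packed value is 0x2222222211111111, whereas a request with 64-bit elements is rejected, which corresponds to the case the patch now handles with a broadcast.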
 
The test is not a run test because it requires vlen=256 in qemu.
I can adjust that still of course.
 
Regtested on rv64, rv32 still running.
 
Regards
Robin
 
gcc/ChangeLog:
 
PR target/114028
 
* config/riscv/riscv-v.cc (rvv_builder::can_duplicate_repeating_sequence_p):
Return false if inner mode is already Pmode.
(rvv_builder::is_all_same_sequence): New function.
(expand_vec_init): Emit broadcast if sequence is all same.
 
gcc/testsuite/ChangeLog:
 
* gcc.target/riscv/rvv/autovec/pr114028.c: New test.
---
gcc/config/riscv/riscv-v.cc   | 25 ++-
.../gcc.target/riscv/rvv/autovec/pr114028.c   | 25 +++
2 files changed, 49 insertions(+), 1 deletion(-)
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr114028.c
 
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 0cfbd21ce6f..29d58deb995 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -443,6 +443,7 @@ public:
   }
   bool can_duplicate_repeating_sequence_p ();
+  bool is_repeating_sequence ();
   rtx get_merged_repeating_sequence ();
   bool repeating_sequence_use_merge_profitable_p ();
@@ -483,7 +484,8 @@ rvv_builder::can_duplicate_repeating_sequence_p ()
{
   poly_uint64 new_size = exact_div (full_nelts (), npatterns ());
   unsigned int new_inner_size = m_inner_bits_size * npatterns ();
-  if (!int_mode_for_size (new_inner_size, 0).exists (&m_new_inner_mode)
+  if (m_inner_mode == Pmode
+  || !int_mode_for_size (new_inner_size, 0).exists (&m_new_inner_mode)
   || GET_MODE_SIZE (m_new_inner_mode) > UNITS_PER_WORD
   || !get_vector_mode (m_new_inner_mode, new_size).exists (&m_new_mode))
 return false;
@@ -492,6 +494,18 @@ rvv_builder::can_duplicate_repeating_sequence_p ()
   return nelts_per_pattern () == 1;
}
+/* Return true if the vector is a simple sequence with one pattern and all
+   elements the same.  */
+bool
+rvv_builder::is_repeating_sequence ()
+{
+  if (npatterns () > 1)
+return false;
+  if (full_nelts ().is_constant ())
+return repeating_sequence_p (0, full_nelts ().to_constant (), 1);
+  return nelts_per_pattern () == 1;
+}
+
/* Return true if it is a repeating sequence that using
merge approach has better codegen than using default
approach (slide1down).
@@ -2544,6 +2558,15 @@ expand_vec_init (rtx target, rtx vals)
 v.quick_push (XVECEXP (vals, 0, i));
   v.finalize ();
+  /* If the sequence is v = { a, a, a, a } just broadcast an element.  */
+  if (v.is_repeating_sequence ())
+{
+  machine_mode mode = GET_MODE (target);
+  rtx dup = expand_vector_broadcast (mode, v.elt (0));
+  emit_move_insn (target, dup);
+  return;
+}
+
   if (nelts > 3)
 {
   /* Case 1: Convert v = { a, b, a, b } into v = { ab, ab }.  */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr114028.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr114028.c
new file mode 100644
index 000..a451d85e3fe
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr114028.c
@@ -0,0 +1,25 @@
+/* { dg-do compile }  */
+/* { dg-options "-march=rv64gcv_zvl256b -O3" } */
+
+int a, d = 55003;
+long c = 0, h;
+long e = 1;
+short i;
+
+int
+main ()
+{
+  for (int g = 0; g < 16; g++)
+{
+  d |= c;
+  short l = d;
+  i = l < 0 || a >> 4 ? d : a;
+  h = i - 8L;
+  e &= h;
+}
+
+  if (e != 1)
+__builtin_abort ();
+}
+
+/* { dg-final { scan-assembler-times "vmv\.v\.i\tv\[0-9\],0" 0 } } */
-- 
2.43.2
 


Re: PING: [PATCH] x86-64: Check R_X86_64_CODE_6_GOTTPOFF support

2024-02-22 Thread H.J. Lu
On Thu, Feb 22, 2024 at 6:39 PM Hongtao Liu  wrote:
>
> On Thu, Feb 22, 2024 at 10:33 PM H.J. Lu  wrote:
> >
> > On Sun, Feb 18, 2024 at 8:02 AM H.J. Lu  wrote:
> > >
> > > If assembler and linker supports
> > >
> > > add %reg1, name@gottpoff(%rip), %reg2
> > >
> > > with R_X86_64_CODE_6_GOTTPOFF, we can generate it instead of
> > >
> > > mov name@gottpoff(%rip), %reg2
> > > add %reg1, %reg2
> x86 part LGTM, but I'm not familiar with the changes in config related files.

Jakub, Uros, Alexandre, can you review the configure.ac change in this patch?

https://patchwork.sourceware.org/project/gcc/list/?series=31075

Thanks.

> > >
> > > gcc/
> > >
> > > * configure.ac (HAVE_AS_R_X86_64_CODE_6_GOTTPOFF): Defined as 1
> > > if R_X86_64_CODE_6_GOTTPOFF is supported.
> > > * config.in: Regenerated.
> > > * configure: Likewise.
> > > * config/i386/predicates.md (apx_ndd_add_memory_operand): Allow
> > > UNSPEC_GOTNTPOFF if R_X86_64_CODE_6_GOTTPOFF is supported.
> > >
> > > gcc/testsuite/
> > >
> > > * gcc.target/i386/apx-ndd-tls-1b.c: New test.
> > > * lib/target-supports.exp
> > > (check_effective_target_code_6_gottpoff_reloc): New.
> > > ---
> > >  gcc/config.in |  7 +++
> > >  gcc/config/i386/predicates.md |  6 +-
> > >  gcc/configure | 62 +++
> > >  gcc/configure.ac  | 37 +++
> > >  .../gcc.target/i386/apx-ndd-tls-1b.c  |  9 +++
> > >  gcc/testsuite/lib/target-supports.exp | 48 ++
> > >  6 files changed, 168 insertions(+), 1 deletion(-)
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/apx-ndd-tls-1b.c
> > >
> > > diff --git a/gcc/config.in b/gcc/config.in
> > > index ce1d073833f..f3de4ba6776 100644
> > > --- a/gcc/config.in
> > > +++ b/gcc/config.in
> > > @@ -737,6 +737,13 @@
> > >  #endif
> > >
> > >
> > > +/* Define 0/1 if your assembler and linker support 
> > > R_X86_64_CODE_6_GOTTPOFF.
> > > +   */
> > > +#ifndef USED_FOR_TARGET
> > > +#undef HAVE_AS_R_X86_64_CODE_6_GOTTPOFF
> > > +#endif
> > > +
> > > +
> > >  /* Define if your assembler supports relocs needed by -fpic. */
> > >  #ifndef USED_FOR_TARGET
> > >  #undef HAVE_AS_SMALL_PIC_RELOCS
> > > diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
> > > index 4c1aedd7e70..391f108c360 100644
> > > --- a/gcc/config/i386/predicates.md
> > > +++ b/gcc/config/i386/predicates.md
> > > @@ -2299,10 +2299,14 @@ (define_predicate "apx_ndd_memory_operand"
> > >
> > >  ;; Return true if OP is a memory operand which can be used in APX NDD
> > >  ;; ADD with register source operand.  UNSPEC_GOTNTPOFF memory operand
> > > -;; isn't allowed with APX NDD ADD.
> > > +;; is allowed with APX NDD ADD only if R_X86_64_CODE_6_GOTTPOFF works.
> > >  (define_predicate "apx_ndd_add_memory_operand"
> > >(match_operand 0 "memory_operand")
> > >  {
> > > +  /* OK if "add %reg1, name@gottpoff(%rip), %reg2" is supported.  */
> > > +  if (HAVE_AS_R_X86_64_CODE_6_GOTTPOFF)
> > > +return true;
> > > +
> > >op = XEXP (op, 0);
> > >
> > >/* Disallow APX NDD ADD with UNSPEC_GOTNTPOFF.  */
> > > diff --git a/gcc/configure b/gcc/configure
> > > index 41b978b0380..c59c971862c 100755
> > > --- a/gcc/configure
> > > +++ b/gcc/configure
> > > @@ -29834,6 +29834,68 @@ cat >>confdefs.h <<_ACEOF
> > >  _ACEOF
> > >
> > >
> > > +if echo "$ld_ver" | grep GNU > /dev/null; then
> > > +  if $gcc_cv_ld -V 2>/dev/null | grep elf_x86_64_sol2 > /dev/null; 
> > > then
> > > +ld_ix86_gld_64_opt="-melf_x86_64_sol2"
> > > +  else
> > > +ld_ix86_gld_64_opt="-melf_x86_64"
> > > +  fi
> > > +fi
> > > +conftest_s='
> > > +   .text
> > > +   .globl  _start
> > > +   .type _start, @function
> > > +_start:
> > > +   addq%r23,foo@GOTTPOFF(%rip), %r15
> > > +   .section .tdata,"awT",@progbits
> > > +   .type foo, @object
> > > +foo:
> > > +   .quad 0'
> > > +{ $as_echo "$as_me:${as_lineno-$LINENO}: checking assembler for 
> > > R_X86_64_CODE_6_GOTTPOFF reloc" >&5
> > > +$as_echo_n "checking assembler for R_X86_64_CODE_6_GOTTPOFF reloc... " 
> > > >&6; }
> > > +if ${gcc_cv_as_x86_64_code_6_gottpoff+:} false; then :
> > > +  $as_echo_n "(cached) " >&6
> > > +else
> > > +  gcc_cv_as_x86_64_code_6_gottpoff=no
> > > +  if test x$gcc_cv_as != x; then
> > > +$as_echo "$conftest_s" > conftest.s
> > > +if { ac_try='$gcc_cv_as $gcc_cv_as_flags  -o conftest.o conftest.s 
> > > >&5'
> > > +  { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_try\""; } >&5
> > > +  (eval $ac_try) 2>&5
> > > +  ac_status=$?
> > > +  $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
> > > +  test $ac_status = 0; }; }
> > > +then
> > > +   if test x$gcc_cv_ld != x && test x$gcc_cv_objdump != x \
> > > +   && test x$gcc_cv_readelf != x \
> > > +   && $gcc_cv_rea

Re: [PATCH] RISC-V: Fix vec_init for simple sequences [PR114028].

2024-02-22 Thread juzhe.zh...@rivai.ai
lgtm.



juzhe.zh...@rivai.ai
 
From: Robin Dapp
Date: 2024-02-23 04:02
To: gcc-patches; palmer; Kito Cheng; juzhe.zh...@rivai.ai
CC: rdapp.gcc; jeffreyalaw
Subject: [PATCH] RISC-V: Fix vec_init for simple sequences [PR114028].
Hi,
 
for a vec_init (_a, _a, _a, _a) with _a of mode DImode we try to
construct a "superword" of two "_a"s.  This only works for modes < Pmode
when we can "shift and or" two halves into one Pmode register.
This patch disallows the optimization for inner_mode == Pmode and emits
a simple broadcast in such a case.
 
The test is not a run test because it requires vlen=256 in qemu.
I can adjust that still of course.
 
Regtested on rv64, rv32 still running.
 
Regards
Robin
 
gcc/ChangeLog:
 
PR target/114028
 
* config/riscv/riscv-v.cc (rvv_builder::can_duplicate_repeating_sequence_p):
Return false if inner mode is already Pmode.
(rvv_builder::is_all_same_sequence): New function.
(expand_vec_init): Emit broadcast if sequence is all same.
 
gcc/testsuite/ChangeLog:
 
* gcc.target/riscv/rvv/autovec/pr114028.c: New test.
---
gcc/config/riscv/riscv-v.cc   | 25 ++-
.../gcc.target/riscv/rvv/autovec/pr114028.c   | 25 +++
2 files changed, 49 insertions(+), 1 deletion(-)
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr114028.c
 
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 0cfbd21ce6f..29d58deb995 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -443,6 +443,7 @@ public:
   }
   bool can_duplicate_repeating_sequence_p ();
+  bool is_repeating_sequence ();
   rtx get_merged_repeating_sequence ();
   bool repeating_sequence_use_merge_profitable_p ();
@@ -483,7 +484,8 @@ rvv_builder::can_duplicate_repeating_sequence_p ()
{
   poly_uint64 new_size = exact_div (full_nelts (), npatterns ());
   unsigned int new_inner_size = m_inner_bits_size * npatterns ();
-  if (!int_mode_for_size (new_inner_size, 0).exists (&m_new_inner_mode)
+  if (m_inner_mode == Pmode
+  || !int_mode_for_size (new_inner_size, 0).exists (&m_new_inner_mode)
   || GET_MODE_SIZE (m_new_inner_mode) > UNITS_PER_WORD
   || !get_vector_mode (m_new_inner_mode, new_size).exists (&m_new_mode))
 return false;
@@ -492,6 +494,18 @@ rvv_builder::can_duplicate_repeating_sequence_p ()
   return nelts_per_pattern () == 1;
}
+/* Return true if the vector is a simple sequence with one pattern and all
+   elements the same.  */
+bool
+rvv_builder::is_repeating_sequence ()
+{
+  if (npatterns () > 1)
+return false;
+  if (full_nelts ().is_constant ())
+return repeating_sequence_p (0, full_nelts ().to_constant (), 1);
+  return nelts_per_pattern () == 1;
+}
+
/* Return true if it is a repeating sequence that using
merge approach has better codegen than using default
approach (slide1down).
@@ -2544,6 +2558,15 @@ expand_vec_init (rtx target, rtx vals)
 v.quick_push (XVECEXP (vals, 0, i));
   v.finalize ();
+  /* If the sequence is v = { a, a, a, a } just broadcast an element.  */
+  if (v.is_repeating_sequence ())
+{
+  machine_mode mode = GET_MODE (target);
+  rtx dup = expand_vector_broadcast (mode, v.elt (0));
+  emit_move_insn (target, dup);
+  return;
+}
+
   if (nelts > 3)
 {
   /* Case 1: Convert v = { a, b, a, b } into v = { ab, ab }.  */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr114028.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr114028.c
new file mode 100644
index 000..a451d85e3fe
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr114028.c
@@ -0,0 +1,25 @@
+/* { dg-do compile }  */
+/* { dg-options "-march=rv64gcv_zvl256b -O3" } */
+
+int a, d = 55003;
+long c = 0, h;
+long e = 1;
+short i;
+
+int
+main ()
+{
+  for (int g = 0; g < 16; g++)
+{
+  d |= c;
+  short l = d;
+  i = l < 0 || a >> 4 ? d : a;
+  h = i - 8L;
+  e &= h;
+}
+
+  if (e != 1)
+__builtin_abort ();
+}
+
+/* { dg-final { scan-assembler-times "vmv\.v\.i\tv\[0-9\],0" 0 } } */
-- 
2.43.2
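
The all-same check that gates the broadcast in the patch above can be sketched outside of GCC as follows. This is only an illustration under stated assumptions: `all_same` and the use of `std::vector<long>` are hypothetical stand-ins, not the `rvv_builder` interface.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for the check: true when every element of the
// initializer equals the first one, i.e. v = { a, a, a, a }.
// An empty initializer is treated as not broadcastable (an assumption
// made for this sketch only).
bool all_same(const std::vector<long>& elts) {
  if (elts.empty())
    return false;
  return std::all_of(elts.begin(), elts.end(),
                     [&](long e) { return e == elts.front(); });
}
```

A caller would take the broadcast path when `all_same` returns true and fall through to the more general cases otherwise, for example for `{ a, b, a, b }`.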
 


Re: PING: [PATCH] x86-64: Check R_X86_64_CODE_6_GOTTPOFF support

2024-02-22 Thread Hongtao Liu
On Thu, Feb 22, 2024 at 10:33 PM H.J. Lu  wrote:
>
> On Sun, Feb 18, 2024 at 8:02 AM H.J. Lu  wrote:
> >
> > If the assembler and linker support
> >
> > add %reg1, name@gottpoff(%rip), %reg2
> >
> > with R_X86_64_CODE_6_GOTTPOFF, we can generate it instead of
> >
> > mov name@gottpoff(%rip), %reg2
> > add %reg1, %reg2
x86 part LGTM, but I'm not familiar with the changes in config related files.
> >
> > gcc/
> >
> > * configure.ac (HAVE_AS_R_X86_64_CODE_6_GOTTPOFF): Defined as 1
> > if R_X86_64_CODE_6_GOTTPOFF is supported.
> > * config.in: Regenerated.
> > * configure: Likewise.
> > * config/i386/predicates.md (apx_ndd_add_memory_operand): Allow
> > UNSPEC_GOTNTPOFF if R_X86_64_CODE_6_GOTTPOFF is supported.
> >
> > gcc/testsuite/
> >
> > * gcc.target/i386/apx-ndd-tls-1b.c: New test.
> > * lib/target-supports.exp
> > (check_effective_target_code_6_gottpoff_reloc): New.
> > ---
> >  gcc/config.in |  7 +++
> >  gcc/config/i386/predicates.md |  6 +-
> >  gcc/configure | 62 +++
> >  gcc/configure.ac  | 37 +++
> >  .../gcc.target/i386/apx-ndd-tls-1b.c  |  9 +++
> >  gcc/testsuite/lib/target-supports.exp | 48 ++
> >  6 files changed, 168 insertions(+), 1 deletion(-)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/apx-ndd-tls-1b.c
> >
> > diff --git a/gcc/config.in b/gcc/config.in
> > index ce1d073833f..f3de4ba6776 100644
> > --- a/gcc/config.in
> > +++ b/gcc/config.in
> > @@ -737,6 +737,13 @@
> >  #endif
> >
> >
> > +/* Define 0/1 if your assembler and linker support 
> > R_X86_64_CODE_6_GOTTPOFF.
> > +   */
> > +#ifndef USED_FOR_TARGET
> > +#undef HAVE_AS_R_X86_64_CODE_6_GOTTPOFF
> > +#endif
> > +
> > +
> >  /* Define if your assembler supports relocs needed by -fpic. */
> >  #ifndef USED_FOR_TARGET
> >  #undef HAVE_AS_SMALL_PIC_RELOCS
> > diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
> > index 4c1aedd7e70..391f108c360 100644
> > --- a/gcc/config/i386/predicates.md
> > +++ b/gcc/config/i386/predicates.md
> > @@ -2299,10 +2299,14 @@ (define_predicate "apx_ndd_memory_operand"
> >
> >  ;; Return true if OP is a memory operand which can be used in APX NDD
> >  ;; ADD with register source operand.  UNSPEC_GOTNTPOFF memory operand
> > -;; isn't allowed with APX NDD ADD.
> > +;; is allowed with APX NDD ADD only if R_X86_64_CODE_6_GOTTPOFF works.
> >  (define_predicate "apx_ndd_add_memory_operand"
> >(match_operand 0 "memory_operand")
> >  {
> > +  /* OK if "add %reg1, name@gottpoff(%rip), %reg2" is supported.  */
> > +  if (HAVE_AS_R_X86_64_CODE_6_GOTTPOFF)
> > +return true;
> > +
> >op = XEXP (op, 0);
> >
> >/* Disallow APX NDD ADD with UNSPEC_GOTNTPOFF.  */
> > diff --git a/gcc/configure b/gcc/configure
> > index 41b978b0380..c59c971862c 100755
> > --- a/gcc/configure
> > +++ b/gcc/configure
> > @@ -29834,6 +29834,68 @@ cat >>confdefs.h <<_ACEOF
> >  _ACEOF
> >
> >
> > +if echo "$ld_ver" | grep GNU > /dev/null; then
> > +  if $gcc_cv_ld -V 2>/dev/null | grep elf_x86_64_sol2 > /dev/null; then
> > +ld_ix86_gld_64_opt="-melf_x86_64_sol2"
> > +  else
> > +ld_ix86_gld_64_opt="-melf_x86_64"
> > +  fi
> > +fi
> > +conftest_s='
> > +   .text
> > +   .globl  _start
> > +   .type _start, @function
> > +_start:
> > +   addq%r23,foo@GOTTPOFF(%rip), %r15
> > +   .section .tdata,"awT",@progbits
> > +   .type foo, @object
> > +foo:
> > +   .quad 0'
> > +{ $as_echo "$as_me:${as_lineno-$LINENO}: checking assembler for 
> > R_X86_64_CODE_6_GOTTPOFF reloc" >&5
> > +$as_echo_n "checking assembler for R_X86_64_CODE_6_GOTTPOFF reloc... " 
> > >&6; }
> > +if ${gcc_cv_as_x86_64_code_6_gottpoff+:} false; then :
> > +  $as_echo_n "(cached) " >&6
> > +else
> > +  gcc_cv_as_x86_64_code_6_gottpoff=no
> > +  if test x$gcc_cv_as != x; then
> > +$as_echo "$conftest_s" > conftest.s
> > +if { ac_try='$gcc_cv_as $gcc_cv_as_flags  -o conftest.o conftest.s >&5'
> > +  { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_try\""; } >&5
> > +  (eval $ac_try) 2>&5
> > +  ac_status=$?
> > +  $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
> > +  test $ac_status = 0; }; }
> > +then
> > +   if test x$gcc_cv_ld != x && test x$gcc_cv_objdump != x \
> > +   && test x$gcc_cv_readelf != x \
> > +   && $gcc_cv_readelf --relocs --wide conftest.o 2>&1 \
> > +  | grep R_X86_64_CODE_6_GOTTPOFF > /dev/null 2>&1 \
> > +   && $gcc_cv_ld $ld_ix86_gld_64_opt -o conftest conftest.o > 
> > /dev/null 2>&1; then
> > +  if $gcc_cv_objdump -dw conftest 2>&1 \
> > + | grep "add \+\$0xf\+8,%r23,%r15" > /dev/null 2>&1; then
> > +gcc_cv_as_x86_64_code_6_gottpoff=yes
> > +  else
> > +gcc_cv_as_x86_64_code_6_gottpoff=no

Re: [pushed] testsuite: fix Wmismatched-new-delete-8.C with -m32

2024-02-22 Thread Marek Polacek
On Thu, Feb 22, 2024 at 04:06:51PM -0800, Andrew Pinski wrote:
> On Thu, Feb 22, 2024, 15:56 Marek Polacek  wrote:
> 
> > Tested x86_64-pc-linux-gnu, applying to trunk.
> 
> 
> I backported/pushed the change to 13 branch already so please apply it
> there too.

Ah right.  Done.

Marek



Re: [pushed] testsuite: fix Wmismatched-new-delete-8.C with -m32

2024-02-22 Thread Andrew Pinski
On Thu, Feb 22, 2024, 15:56 Marek Polacek  wrote:

> Tested x86_64-pc-linux-gnu, applying to trunk.


I backported/pushed the change to 13 branch already so please apply it
there too.

Thanks,
Andrew




> -- >8 --
> This fixes
> error: 'operator new' takes type 'size_t' ('unsigned int') as first
> parameter [-fpermissive]
>
> gcc/testsuite/ChangeLog:
>
> * g++.dg/warn/Wmismatched-new-delete-8.C: Use __SIZE_TYPE__.
> ---
>  gcc/testsuite/g++.dg/warn/Wmismatched-new-delete-8.C | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/testsuite/g++.dg/warn/Wmismatched-new-delete-8.C
> b/gcc/testsuite/g++.dg/warn/Wmismatched-new-delete-8.C
> index 0ddc056c6df..e8fd7a85b8c 100644
> --- a/gcc/testsuite/g++.dg/warn/Wmismatched-new-delete-8.C
> +++ b/gcc/testsuite/g++.dg/warn/Wmismatched-new-delete-8.C
> @@ -11,7 +11,7 @@ static inline T * construct_at(void *at, ARGS && args)
>   struct Placeable : T
>   {
>Placeable(ARGS && args) : T(args) { }
> -  void * operator new (long unsigned int, void *ptr) { return ptr; }
> +  void * operator new (__SIZE_TYPE__, void *ptr) { return ptr; }
>void operator delete (void *, void *) { }
>   };
>   return new (at) Placeable(static_cast(args));
>
> base-commit: 37127ed975e09813eaa2d1cf1062055fce45dd16
> --
> 2.43.2
>
>


[pushed] testsuite: fix Wmismatched-new-delete-8.C with -m32

2024-02-22 Thread Marek Polacek
Tested x86_64-pc-linux-gnu, applying to trunk.

-- >8 --
This fixes
error: 'operator new' takes type 'size_t' ('unsigned int') as first parameter 
[-fpermissive]

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wmismatched-new-delete-8.C: Use __SIZE_TYPE__.
---
 gcc/testsuite/g++.dg/warn/Wmismatched-new-delete-8.C | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/g++.dg/warn/Wmismatched-new-delete-8.C 
b/gcc/testsuite/g++.dg/warn/Wmismatched-new-delete-8.C
index 0ddc056c6df..e8fd7a85b8c 100644
--- a/gcc/testsuite/g++.dg/warn/Wmismatched-new-delete-8.C
+++ b/gcc/testsuite/g++.dg/warn/Wmismatched-new-delete-8.C
@@ -11,7 +11,7 @@ static inline T * construct_at(void *at, ARGS && args)
  struct Placeable : T
  {
   Placeable(ARGS && args) : T(args) { }
-  void * operator new (long unsigned int, void *ptr) { return ptr; }
+  void * operator new (__SIZE_TYPE__, void *ptr) { return ptr; }
   void operator delete (void *, void *) { }
  };
  return new (at) Placeable(static_cast(args));

base-commit: 37127ed975e09813eaa2d1cf1062055fce45dd16
-- 
2.43.2
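
For reference, a minimal standalone sketch of why `__SIZE_TYPE__` is the portable spelling here. It assumes a GCC-compatible compiler (the macro is a GCC/Clang predefine); `Placeable` is reduced to a plain struct and `construct_and_read` is a hypothetical helper, neither is taken from the testcase.

```cpp
#include <cstddef>
#include <type_traits>

// __SIZE_TYPE__ expands to the type of size_t for the current target:
// 'unsigned int' with -m32 and 'long unsigned int' with -m64 on x86 Linux.
static_assert(std::is_same<__SIZE_TYPE__, std::size_t>::value,
              "__SIZE_TYPE__ names the same type as std::size_t");

struct Placeable {
  int value;
  explicit Placeable(int v) : value(v) {}
  // The first parameter of operator new must be size_t; hard-coding
  // 'long unsigned int' is rejected where size_t is 'unsigned int'.
  void* operator new(__SIZE_TYPE__, void* ptr) { return ptr; }
  void operator delete(void*, void*) {}
};

// Construct a Placeable in caller-provided storage and read it back.
int construct_and_read(int v) {
  alignas(Placeable) unsigned char storage[sizeof(Placeable)];
  Placeable* p = new (storage) Placeable(v);
  int result = p->value;
  p->~Placeable();
  return result;
}
```

The same source then compiles with both `-m32` and `-m64`, since the parameter type follows the target's `size_t`.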



[r14-9138 Regression] FAIL: g++.dg/warn/Wmismatched-new-delete-8.C -std=gnu++20 (test for excess errors) on Linux/x86_64

2024-02-22 Thread haochen.jiang
On Linux/x86_64,

1076ffda6ce5e6d5fc9577deaf8233e549e5787a is the first bad commit
commit 1076ffda6ce5e6d5fc9577deaf8233e549e5787a
Author: Andrew Pinski 
Date:   Wed Feb 21 20:12:21 2024 -0800

warn-access: Fix handling of unnamed types [PR109804]

caused

FAIL: g++.dg/warn/Wmismatched-new-delete-8.C  -std=gnu++14 (test for excess 
errors)
FAIL: g++.dg/warn/Wmismatched-new-delete-8.C  -std=gnu++17 (test for excess 
errors)
FAIL: g++.dg/warn/Wmismatched-new-delete-8.C  -std=gnu++20 (test for excess 
errors)

with GCC configured with

../../gcc/configure 
--prefix=/export/users/haochenj/src/gcc-bisect/master/master/r14-9138/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dg.exp=g++.dg/warn/Wmismatched-new-delete-8.C 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dg.exp=g++.dg/warn/Wmismatched-new-delete-8.C 
--target_board='unix{-m32\ -march=cascadelake}'"

(Please do not reply to this email. For questions about this report, contact me
at haochen dot jiang at intel.com.)
(If you run into cascadelake-related problems, disabling AVX512F on the command
line might help.)
(However, please make sure that there are no potential problems with AVX512.)


[PATCH v1 00/13] Add aarch64-w64-mingw32 target

2024-02-22 Thread Evgeny Karpov
Hi Mark,

Thanks for testing the patch series!
It is great to know that EFI also works well.

Thank you for your contribution to this work!

Regards,
Evgeny

-----Original Message-----
Thursday, February 22, 2024 7:11 PM 
Mark Harmstone wrote:

Hi all,

Seems to work for me! Nice work.

It also works nicely with EFI, for anyone interested:

test.c:

#include 

EFI_STATUS efi_main(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE* SystemTable) {
 SystemTable->ConOut->OutputString(SystemTable->ConOut, L"hello, world\r\n");

 return EFI_SUCCESS;
}

$ aarch64-w64-mingw32-gcc -I/usr/include/efi -nostartfiles -Wl,--subsystem,10 -eefi_main test.c -o test.efi

Mark


[PATCH v1 00/13] Add aarch64-w64-mingw32 target

2024-02-22 Thread Evgeny Karpov
Hi Maxim,

Thank you for the review and the test build!

Regards,
Evgeny

-----Original Message-----
Wednesday, February 21, 2024 7:27 PM 
Maxim Kuvyrkov wrote:

Hi Evgeny,

Great job!

For reference, here is a test build of this patch series using Linaro Toolchain 
CI: 
https://ci.linaro.org/view/tcwg-build/job/tcwg_gnu_mingw_build--master-woa64-build/9/artifact/artifacts/

--
Maxim Kuvyrkov
 


Re: [PATCH] libgccjit: Add support for setting the comment ident

2024-02-22 Thread Antoni Boucher
Thanks for the review.

Here's the updated patch. See my answers to your questions below.

On Fri, 2024-01-05 at 14:39 -0500, David Malcolm wrote:
> On Fri, 2024-01-05 at 12:09 -0500, Antoni Boucher wrote:
> > Hi.
> > This patch adds support for setting the comment ident (analogous to
> > #ident "comment" in C).
> > Thanks for the review.
> 
> Thanks for the patch.
> 
> This may sound like a silly question, but what does #ident do and
> what
> is it used for?

This adds text to the .comment section.

> 
> FWIW I found this in our documentation:
>   https://gcc.gnu.org/onlinedocs/cpp/Other-Directives.html
> 
> [...snip...]
> 
> > +Output options
> > +**
> > +
> > +.. function:: void gcc_jit_context_set_output_ident
> > (gcc_jit_context *ctxt,\
> > + const char*
> > output_ident)
> > +
> > +   Set the identifier to write in the .comment section of the
> > output file to
> > +   ``output_ident``. Analogous to:
> 
> ...but only on some targets, according to the link above.  Maybe add
> that link here?
> 
> > +
> > +   .. code-block:: c
> > +
> > +  #ident "My comment"
> > +
> > +   in C.
> > +
> > +   This entrypoint was added in :ref:`LIBGCCJIT_ABI_26`; you can
> > test for
> > +   its presence using
> 
> Can the param "output_ident" be NULL?  It isn't checked for NULL in
> the
> patch's implementation of gcc_jit_context_set_output_ident, and
> recording::output_ident's constructor does check for it being NULL...
> 
> > +
> > +   .. code-block:: c
> > +
> > +  #ifdef LIBGCCJIT_HAVE_gcc_jit_context_set_output_ident
> 
> > diff --git a/gcc/jit/jit-playback.cc b/gcc/jit/jit-playback.cc
> > index 537f3b1..243a9fdf972 100644
> > --- a/gcc/jit/jit-playback.cc
> > +++ b/gcc/jit/jit-playback.cc
> > @@ -319,6 +319,13 @@ get_type (enum gcc_jit_types type_)
> >    return new type (type_node);
> >  }
> >  
> > +void
> > +playback::context::
> > +set_output_ident (const char* ident)
> > +{
> > +  targetm.asm_out.output_ident (ident);
> > +}
> > +
> 
> ...but looking at varasm.cc's default_asm_output_ident_directive it
> looks like the param must be non-NULL.
> 
> So this should either be conditionalized here to:
> 
>   if (ident)
>     targetm.asm_out.output_ident (ident);
> 
> or else we should ensure that "ident" is non-NULL at the API boundary
> and document this.

Ok, updated the patch to do this at the API boundary.

> 
> My guess is that it doesn't make sense to have a NULL ident, so we
> should go with the latter approach.
> 
> Can you have more than one #ident directive?  Presumably each one
> just
> adds another line to the generated asm, right?

Yes.

> 
> [...snip...]
> 
> > @@ -2185,6 +2192,52 @@ recording::string::write_reproducer
> > (reproducer &)
> >    /* Empty.  */
> >  }
> >  
> > +/* The implementation of class gcc::jit::recording::output_ident. 
> > */
> > +
> > +/* Constructor for gcc::jit::recording::output_ident, allocating a
> > +   copy of the given text using new char[].  */
> > +
> > +recording::output_ident::output_ident (context *ctxt, const char
> > *ident)
> > +: memento (ctxt)
> > +{
> > +  m_ident = ident ? xstrdup (ident) : NULL;
> > +}
> > +
> > +/* Destructor for gcc::jit::recording::output_ident.  */
> > +
> > +recording::output_ident::~output_ident ()
> > +{
> > +  delete[] m_ident;
> 
> m_ident is allocated above using xstrdup, so it must be cleaned up
> with
> "free"; I don't think it's safe to use "delete[]" here.
> 
> [...snip...]
> 
> > +/* Implementation of recording::memento::write_reproducer for
> > output_ident.  */
> > +
> > +void
> > +recording::output_ident::write_reproducer (reproducer &r)
> > +{
> > +  r.write ("  gcc_jit_context_set_output_ident (%s, \"%s\");",
> > +      r.get_identifier (get_context ()),
> > +      m_ident);
> 
> It isn't safe on all implementations to use %s with m_ident being
> NULL.

Now, m_ident is non-NULL.

> 
> [...snip...]
> 
> Thanks again for the patch; hope this is constructive
> Dave
> 
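
To illustrate the allocation/deallocation pairing raised in the review above, here is a rough standalone sketch. `IdentHolder` and `copy_string` are hypothetical names; `copy_string` stands in for libiberty's `xstrdup`, which is malloc-based, so the destructor uses `free` rather than `delete[]`. The sketch also assumes the string is non-NULL, matching the check at the API boundary.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical holder mirroring the ownership question: the copy comes
// from the malloc family, so it must be released with free().
class IdentHolder {
 public:
  // 'ident' must be non-NULL (assumed to be enforced by the caller).
  explicit IdentHolder(const char* ident) : m_ident(copy_string(ident)) {}
  ~IdentHolder() { std::free(m_ident); }  // pairs with malloc below
  IdentHolder(const IdentHolder&) = delete;
  IdentHolder& operator=(const IdentHolder&) = delete;
  const char* get() const { return m_ident; }

 private:
  // Stand-in for xstrdup: duplicate the string or abort on failure.
  static char* copy_string(const char* s) {
    const std::size_t n = std::strlen(s) + 1;
    char* copy = static_cast<char*>(std::malloc(n));
    if (!copy)
      std::abort();
    std::memcpy(copy, s, n);
    return copy;
  }

  char* m_ident;
};
```

Pairing `malloc` with `delete[]` (or `new[]` with `free`) is undefined behavior, which is why the destructor has to match the allocator that produced the copy.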

From 1e98faa4cc489641289cb8d6634e24efaa3d36a2 Mon Sep 17 00:00:00 2001
From: Antoni Boucher 
Date: Fri, 27 Oct 2023 17:36:03 -0400
Subject: [PATCH] libgccjit: Add support for setting the comment ident

gcc/jit/ChangeLog:

	* docs/topics/compatibility.rst (LIBGCCJIT_ABI_28): New ABI tag.
	* docs/topics/contexts.rst: Document gcc_jit_context_set_output_ident.
	* jit-playback.cc (set_output_ident): New method.
	* jit-playback.h (set_output_ident): New method.
	* jit-recording.cc (recording::context::set_output_ident,
	recording::output_ident::output_ident,
	recording::output_ident::~output_ident,
	recording::output_ident::replay_into,
	recording::output_ident::make_debug_string,
	recording::output_ident::write_reproducer): New methods.
	* jit-recording.h (class output_ident): New class.
	* libgccjit.cc (gcc_jit_context_set_output_ident): New function.
	* libgccjit.h (gcc_jit_context_set_output_ident): New function.
	* libgccjit.map: New function.

gcc/testsuite/ChangeLog:

	* jit.dg/all-non-f

Re: [PATCH] Fix fortran/PR114024

2024-02-22 Thread Harald Anlauf

On 2/22/24 22:01, Steve Kargl wrote:

On Thu, Feb 22, 2024 at 09:22:37PM +0100, Harald Anlauf wrote:

On the positive side, it not only seems to fix the cases in question,
but also substring references etc., like the following:


If the above passes a regression test, then by all means we should
use it.  I did not consider the substring case.  Even if unneeded
parentheses are inserted, which may cause generation of a temporary
variable, I hope users are not using 'allocate(x,source=z%re)' in
some deeply nested crazy loop structure.


First thing is code correctness.  There are cases where the
allocation shall preserve the array bounds, which is where
we must avoid the parentheses at all cost.  But these cases
should be very limited.  (There are some code comments/TODOs
regarding this and an open PR by Tobias(?)).

The cases we are currently discussing are even requiring(!)
the resetting of the lower bounds to 1, so your suggestion
to enforce parentheses does not look unreasonable.

BTW: If someone uses allocate in a tight loop, he/she deserves
to be punished anyway...


BTW, my patch and I suspect your improved patch also
fixes 'allocate(x,mold=z%re)'.  Consider,

complex z(3)
real, allocatable :: x(:)
z = 42
allocate(x, mold=z%re)
print *, size(x)
end

% gfortran13 -o z a.f90
a.f90:9:25:

 9 |allocate(x, mold=z%re)
   | 1
internal compiler error: in retrieve_last_ref, at fortran/trans-array.cc:6070
0x247d7a679 __libc_start1
 /usr/src/lib/libc/csu/libc_start1.c:157

% gfcx -o z a.f90 && ./z
3



Nice!  I completely forgot about MOLD...

So the only missing pieces are a really comprehensive testcase
and successful regtests...

Cheers,
Harald





Re: [PATCH 0/2 V2] aarch64: Place target independent and dependent code in one file.

2024-02-22 Thread Segher Boessenkool
On Thu, Feb 22, 2024 at 07:49:20PM +, Richard Sandiford wrote:
> Thanks for the update.  This is still quite hard to review though.
> Sorry to ask for another round, but could you split it up further?
> The ideal thing would be if patches that move code do nothing other
> than move code, and if patches that change code do those changes
> in-place.

In general, if there is a (big) part to the patch that does not change
behaviour at all, it should be a separate patch.  Such a patch is then
easy to review (write down in the commit message that it does not change
behaviour though, it helps reviewers).

It also makes the remaining tiny patches much easier to review then.

Very generally, any patch that makes interesting changes should not
have more than a few lines of semantic content.  That can be repeated of
course, and have fall-out mechanical follow-up changes, but that's the
essence of good patchsets: one change per patch.

And then the commit message can be simple as well, and the changelog
will be easy to write.  That is the litmus test for good patch series :-)


Segher


[COMMITTED/13] warn-access: Fix handling of unnamed types [PR109804]

2024-02-22 Thread Andrew Pinski
This looks like an oversight of handling DEMANGLE_COMPONENT_UNNAMED_TYPE.
DEMANGLE_COMPONENT_UNNAMED_TYPE only has the u.s_number.number set while
the code expected newc.u.s_binary.left would be valid.
So this treats DEMANGLE_COMPONENT_UNNAMED_TYPE like we treat function parameters
(DEMANGLE_COMPONENT_FUNCTION_PARAM) and template parameters
(DEMANGLE_COMPONENT_TEMPLATE_PARAM).

Note the code in the demangler does this when it sets 
DEMANGLE_COMPONENT_UNNAMED_TYPE:
  ret->type = DEMANGLE_COMPONENT_UNNAMED_TYPE;
  ret->u.s_number.number = num;

Committed as obvious after bootstrap/test on x86_64-linux-gnu

PR tree-optimization/109804

gcc/ChangeLog:

* gimple-ssa-warn-access.cc (new_delete_mismatch_p): Handle
DEMANGLE_COMPONENT_UNNAMED_TYPE.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wmismatched-new-delete-8.C: New test.

Signed-off-by: Andrew Pinski 
(cherry picked from commit 1076ffda6ce5e6d5fc9577deaf8233e549e5787a)
---
 gcc/gimple-ssa-warn-access.cc |  1 +
 .../g++.dg/warn/Wmismatched-new-delete-8.C| 42 +++
 2 files changed, 43 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/warn/Wmismatched-new-delete-8.C

diff --git a/gcc/gimple-ssa-warn-access.cc b/gcc/gimple-ssa-warn-access.cc
index 48e85e9cab5..cd02a02b1da 100644
--- a/gcc/gimple-ssa-warn-access.cc
+++ b/gcc/gimple-ssa-warn-access.cc
@@ -1700,6 +1700,7 @@ new_delete_mismatch_p (const demangle_component &newc,
 
 case DEMANGLE_COMPONENT_FUNCTION_PARAM:
 case DEMANGLE_COMPONENT_TEMPLATE_PARAM:
+case DEMANGLE_COMPONENT_UNNAMED_TYPE:
   return newc.u.s_number.number != delc.u.s_number.number;
 
 case DEMANGLE_COMPONENT_CHARACTER:
diff --git a/gcc/testsuite/g++.dg/warn/Wmismatched-new-delete-8.C 
b/gcc/testsuite/g++.dg/warn/Wmismatched-new-delete-8.C
new file mode 100644
index 000..0ddc056c6df
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wmismatched-new-delete-8.C
@@ -0,0 +1,42 @@
+/* PR tree-optimization/109804 */
+/* { dg-do compile { target c++11 } } */
+/* { dg-options "-Wall" } */
+
+/* Here we used to ICE in new_delete_mismatch_p because
+   we didn't handle unnamed types from the demangler 
(DEMANGLE_COMPONENT_UNNAMED_TYPE). */
+
+template 
+static inline T * construct_at(void *at, ARGS && args)
+{
+ struct Placeable : T
+ {
+  Placeable(ARGS && args) : T(args) { }
+  void * operator new (long unsigned int, void *ptr) { return ptr; }
+  void operator delete (void *, void *) { }
+ };
+ return new (at) Placeable(static_cast(args));
+}
+template 
+struct Reconstructible
+{
+  char _space[sizeof(MT)];
+  Reconstructible() { }
+};
+template 
+struct Constructible : Reconstructible
+{
+ Constructible(){}
+};
+struct A { };
+struct B
+{
+ Constructible a { };
+ B(int) { }
+};
+Constructible b { };
+void f()
+{
+  enum { ENUM_A = 1 };
+  enum { ENUM_B = 1 };
+  construct_at(b._space, ENUM_B);
+}
-- 
2.43.0
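
The underlying issue, reading the wrong union member for a given node kind, can be sketched with a simplified model. Everything below is a hypothetical reduction for illustration (`Kind`, `Component`, `make_number`, `mismatch`); it is not the libiberty `demangle_component` layout.

```cpp
// Simplified, hypothetical model of a demangler node: which union member
// is valid depends on the node kind.
enum class Kind { FunctionParam, TemplateParam, UnnamedType, Binary };

struct Component {
  Kind kind;
  union {
    struct { long number; } s_number;  // valid for number-carrying kinds
    struct { const Component* left; const Component* right; } s_binary;
  } u;
};

// Build a number-carrying node; only u.s_number is set.
Component make_number(Kind kind, long number) {
  Component c{};
  c.kind = kind;
  c.u.s_number.number = number;
  return c;
}

// Returns true when the two nodes differ.  Number-carrying kinds,
// including unnamed types, are compared by number and never via s_binary.
bool mismatch(const Component& a, const Component& b) {
  if (a.kind != b.kind)
    return true;
  switch (a.kind) {
    case Kind::FunctionParam:
    case Kind::TemplateParam:
    case Kind::UnnamedType:
      return a.u.s_number.number != b.u.s_number.number;
    case Kind::Binary:
      return a.u.s_binary.left != b.u.s_binary.left
             || a.u.s_binary.right != b.u.s_binary.right;
  }
  return true;
}
```

Without the `UnnamedType` case, such a node would fall through to code that expects `s_binary.left` to be valid, which is the situation the one-line fix avoids.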



Re: [PATCH] combine: Don't simplify high part of paradoxical-SUBREG-of-MEM on machines that sign-extend loads [PR113010]

2024-02-22 Thread Jakub Jelinek
On Thu, Feb 22, 2024 at 12:59:18PM -0800, Greg McGary wrote:
> The sign bit of a sign-extending load cannot be known until runtime,
> so don't attempt to simplify it in the combiner.
> 
> 2024-02-22  Greg McGary  
> 
> PR rtl-optimization/113010
> * combine.cc (simplify_comparison): Don't simplify high part
>   of paradoxical-SUBREG-of-MEM on machines that sign-extend loads
> 
> * gcc.c-torture/execute/pr113010.c: New test.
> ---
>  gcc/combine.cc | 10 --
>  gcc/testsuite/gcc.c-torture/execute/pr113010.c |  9 +
>  2 files changed, 17 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.c-torture/execute/pr113010.c
> 
> diff --git a/gcc/combine.cc b/gcc/combine.cc
> index 812553c091e..736206242e1 100644
> --- a/gcc/combine.cc
> +++ b/gcc/combine.cc
> @@ -12550,9 +12550,15 @@ simplify_comparison (enum rtx_code code, rtx *pop0, 
> rtx *pop1)
>   }
>  
> /* If the inner mode is narrower and we are extracting the low part,
> -  we can treat the SUBREG as if it were a ZERO_EXTEND.  */
> +  we can treat the SUBREG as if it were a ZERO_EXTEND ...  */
> if (paradoxical_subreg_p (op0))
> - ;
> + {
> +   /* ... except we can't treat as ZERO_EXTEND when a machine
> +  automatically sign-extends loads. */
> +   if (MEM_P (SUBREG_REG (op0)) && WORD_REGISTER_OPERATIONS
> +   && load_extend_op (inner_mode) == SIGN_EXTEND)
> + break;

That doesn't feel sufficient.  Like in the PR112758 patch, I believe
for WORD_REGISTER_OPERATIONS you should treat it as a ZERO_EXTEND only
if MEM_P (SUBREG_REG (op0)) && load_extend_op (inner_mode) == ZERO_EXTEND
or if GET_MODE_PRECISION (inner_mode) is known to be >= BITS_PER_WORD.

Jakub



Re: [PATCH] Fix fortran/PR114024

2024-02-22 Thread Steve Kargl
On Thu, Feb 22, 2024 at 09:22:37PM +0100, Harald Anlauf wrote:
> Hi Steve!
> 
> On 2/22/24 01:52, Steve Kargl wrote:
> > On Wed, Feb 21, 2024 at 01:42:32PM -0800, Steve Kargl wrote:
> > > On Wed, Feb 21, 2024 at 10:20:43PM +0100, Harald Anlauf wrote:
> > > > On 2/21/24 22:00, Steve Kargl wrote:
> > > > > memleak vs ICE.  I think I'll take one over the other.
> > > > > Probably need to free code->expr3 before the copy.
> > > > 
> > > > Yep.
> > > > 
> > > > > I tried gfc_replace_expr in an earlier patch.  It did not
> > > > > work.
> > 
> > I tried freeing code->expr3 before assigning the new expression.
> > That leads to
> > 
> > % gfcx -c ~/gcc/gccx/gcc/testsuite/gfortran.dg/allocate_with_source_28.f90
> > pid 69473 comm f951 has trashed its stack, killing
> > gfortran: internal compiler error: Illegal instruction signal terminated 
> > program f951
> 
> Right.  I also don't see what the lifetimes of the expressions are.
> 
> But is the gfc_copy_expr really needed?  Wouldn't the following suffice?
> 
>   code->expr3 = gfc_get_parentheses (code->expr3);

It's been a while since I used gfc_copy_expr, gfc_replace_expr, etc.
I did not try the above.  If that works, then we should use that
for simplicity.

> > If I don't free code->expr3 but simply assign the new
> > expression from gfc_get_parentheses(), your example
> > now compiles and executes as expected.  It now also compiles
> > allocate_with_source_28.f90.  Caveat:  I don't know
> > how to test the CLASS uu.
> > 
> > > > > > - it still fails on the following code, because the traversal
> > > > > > of the refs is incomplete / wrong:
> > > > > > 
> > > > > > program foo
> > > > > >  implicit none
> > > > > >  complex   :: cmp(3)
> > > > > >  real, pointer :: pp(:)
> > > > > >  class(*), allocatable :: uu(:)
> > > > > >  type t
> > > > > > real :: re
> > > > > > real :: im
> > > > > >  end type t
> > > > > >  type u
> > > > > > type(t) :: tt(3)
> > > > > >  end type u
> > > > > >  type(u) :: cc
> > > > > > 
> > > > > >  cmp = (3.45,6.78)
> > > > > >  cc% tt% re = cmp% re
> > > > > >  cc% tt% im = cmp% im
> > > > > >  allocate (pp, source = cc% tt% im)   ! ICE
> > > > > 
> > > > > cc%tt%im isn't a complex-part-ref, so this seems to
> > > > > be a different (maybe related) issue.  Does the code
> > > > > compile with 'source = (cc%tt%im)'?  If so, perhaps,
> > > > > detecting a component reference and doing the simple
> > > > > wrapping with parentheses can be done.
> > > > 
> > > > Yes, that's why I tried to make up the above example.
> > > > I think %re and %im are not too special, they work
> > > > here pretty much like component refs elsewhere.
> > > > 
> > > 
> > > I see.  The %re and %im complex-part-ref correspond to
> > > ref->u.i == INQUIRY_RE and INQUIRY_IM, respectively.
> > > A part-ref for a user-defined type doesn't have an
> > > INQUIRY_xxx, so we'll need to see if there is a way to
> > > easily identify, e.g., cc%tt%re from your testcase.
> > 
> > The attached patch uses ref->type == REF_COMPONENT to deal
> > with the above code.
> 
> I actually wanted to draw your attention away from the
> real/complex stuff, because that is not really the point.
> When do we actually need to enforce the parentheses?

This is essentially my concern.  I was inserting parentheses
only if I determined they were needed (to avoid an unnecessary
temporary variable).  That code path then enters the else portion
of the following if-else-stmt, where a temporary is created.
That is,

allocate(x, source=z%re) becomes allocate(x, source=(z%re))
and from code generation viewpoint this is

tmp = (z%re)
allocate(x, source=tmp)
deallocate(tmp)

> I tried the following, and it seems to work:
> 
>   if (code->expr3->expr_type == EXPR_VARIABLE
> && is_subref_array (code->expr3))
>   code->expr3 = gfc_get_parentheses (code->expr3);
> 
> (Beware: this is not regtested!)
> 
> On the positive side, it not only seems to fix the cases in question,
> but also substring references etc., like the following:

If the above passes a regression test, then by all means we should
use it.  I did not consider the substring case.  Even if unneeded
parentheses are inserted, which may cause generation of a temporary
variable, I hope users are not using 'allocate(x,source=z%re)' in
some deeply nested crazy loop structure.

BTW, my patch and I suspect your improved patch also
fixes 'allocate(x,mold=z%re)'.  Consider,

   complex z(3)
   real, allocatable :: x(:)
   z = 42
   allocate(x, mold=z%re)
   print *, size(x)
   end

% gfortran13 -o z a.f90
a.f90:9:25:

9 |allocate(x, mold=z%re)
  | 1
internal compiler error: in retrieve_last_ref, at fortran/trans-array.cc:6070
0x247d7a679 __libc_start1
/usr/src/lib/libc/csu/libc_start1.c:157

% gfcx -o z a.f90 && ./z
   3



-- 
Steve


[PATCH] combine: Don't simplify high part of paradoxical-SUBREG-of-MEM on machines that sign-extend loads [PR113010]

2024-02-22 Thread Greg McGary
The sign bit of a sign-extending load cannot be known until runtime,
so don't attempt to simplify it in the combiner.

2024-02-22  Greg McGary  

PR rtl-optimization/113010
* combine.cc (simplify_comparison): Don't simplify high part
of paradoxical-SUBREG-of-MEM on machines that sign-extend loads

* gcc.c-torture/execute/pr113010.c: New test.
---
 gcc/combine.cc | 10 --
 gcc/testsuite/gcc.c-torture/execute/pr113010.c |  9 +
 2 files changed, 17 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.c-torture/execute/pr113010.c

diff --git a/gcc/combine.cc b/gcc/combine.cc
index 812553c091e..736206242e1 100644
--- a/gcc/combine.cc
+++ b/gcc/combine.cc
@@ -12550,9 +12550,15 @@ simplify_comparison (enum rtx_code code, rtx *pop0, 
rtx *pop1)
}
 
  /* If the inner mode is narrower and we are extracting the low part,
-we can treat the SUBREG as if it were a ZERO_EXTEND.  */
+we can treat the SUBREG as if it were a ZERO_EXTEND ...  */
  if (paradoxical_subreg_p (op0))
-   ;
+   {
+ /* ... except we can't treat as ZERO_EXTEND when a machine
+automatically sign-extends loads. */
+ if (MEM_P (SUBREG_REG (op0)) && WORD_REGISTER_OPERATIONS
+ && load_extend_op (inner_mode) == SIGN_EXTEND)
+   break;
+   }
  else if (subreg_lowpart_p (op0)
   && GET_MODE_CLASS (mode) == MODE_INT
   && is_int_mode (GET_MODE (SUBREG_REG (op0)), &inner_mode)
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr113010.c 
b/gcc/testsuite/gcc.c-torture/execute/pr113010.c
new file mode 100644
index 000..a95c613c1df
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr113010.c
@@ -0,0 +1,9 @@
+int minus_1 = -1;
+
+int
+main ()
+{
+  if ((0, 0xul) >= minus_1)
+__builtin_abort ();
+  return 0;
+}
-- 
2.34.1
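
As a rough illustration of why the ZERO_EXTEND assumption breaks on machines that sign-extend loads, the following standalone sketch models a 32-bit load widened into a 64-bit register. The function names and the 32/64-bit widths are assumptions chosen for the example; this is not combiner code.

```cpp
#include <cstdint>

// Widen a 32-bit memory value by filling the upper bits with zeros.
std::uint64_t zero_extend_load(std::uint32_t mem) {
  return mem;
}

// Widen a 32-bit memory value by replicating its sign bit, as a
// sign-extending load would.
std::uint64_t sign_extend_load(std::uint32_t mem) {
  const std::uint64_t wide = mem;
  return (mem & 0x80000000u) ? (wide | 0xFFFFFFFF00000000ull) : wide;
}

// True when treating the widened value as a ZERO_EXTEND matches what a
// sign-extending load actually leaves in the register.
bool zero_extend_assumption_holds(std::uint32_t mem) {
  return zero_extend_load(mem) == sign_extend_load(mem);
}
```

The two widenings agree only when the sign bit of the loaded value is clear, and that bit is not known at compile time.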



Re: [PATCH] libgccjit: Support signed char flag

2024-02-22 Thread Antoni Boucher
Thanks for the review and idea.

Here's the updated patch. I added a test, but I could not set
-fsigned-char, as this is not an option accepted by the jit frontend.
However, the test still works in the sense that it fails without this
patch and passes with it.
I'm just wondering if it would pass on all targets or if I should add a
target filtering directive to only execute it on some targets.
What do you think?

On Tue, 2024-01-09 at 11:01 -0500, David Malcolm wrote:
> On Thu, 2023-12-21 at 08:42 -0500, Antoni Boucher wrote:
> > Hi.
> > This patch adds support for the -fsigned-char flag.
> 
> Thanks.  The patch looks correct to me.
> 
> > I'm not sure how to test it since I stumbled upon this bug when I found
> > this other bug (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107863),
> > which is now fixed.
> > Any idea how I could test this patch?
> 
> We already document that GCC_JIT_TYPE_CHAR has "some signedness".  The
> bug being fixed here is that gcc_jit_context compilations were always
> treating "char" as unsigned, regardless of the value of -fsigned-char
> (either from the target's default, or as a context option), when it
> makes more sense to follow the C frontend's behavior.
> 
> So perhaps jit-written code with a context that has -fsigned-char as an
> option (via gcc_jit_context_add_command_line_option), and which
> promotes a negative char to a signed int, and then returns the result
> as an int?  Presumably if we're erroneously forcing "char" to be
> unsigned, the int will be in the range 0x80 to 0xff, rather than being
> negative.
> 
> Dave
> 
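
The observable difference described in the review can be shown in plain
C++, independent of libgccjit.  This is a minimal sketch assuming 8-bit
chars; the function names are made up for illustration.

```cpp
#include <climits>

// Promote a char-sized value to int when "char" is signed.
int promote_as_signed(int value) {
  signed char c = static_cast<signed char>(value);
  return static_cast<int>(c);
}

// Promote the same value to int when "char" is forced to be unsigned.
int promote_as_unsigned(int value) {
  unsigned char c = static_cast<unsigned char>(value);
  return static_cast<int>(c);
}

// True when plain "char" is signed for this compilation
// (i.e. the target default or -fsigned-char is in effect).
bool plain_char_is_signed() {
  return CHAR_MIN < 0;
}
```

With the value -2 used by the test in the patch, the signed path yields -2,
whereas the unsigned path yields 254, which is in the 0x80 to 0xff range
mentioned above.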

From 57d4bd695bcb16c54ecbe05346282f5dc270c30a Mon Sep 17 00:00:00 2001
From: Antoni Boucher 
Date: Mon, 3 Oct 2022 19:11:39 -0400
Subject: [PATCH] libgccjit: Support signed char flag

gcc/jit/ChangeLog:

	* dummy-frontend.cc (jit_langhook_init): Send flag_signed_char
	argument to build_common_tree_nodes.

gcc/testsuite/ChangeLog:

	* jit.dg/all-non-failing-tests.h: Add test-signed-char.c.
	* jit.dg/test-signed-char.c: New test.
---
 gcc/jit/dummy-frontend.cc|  2 +-
 gcc/testsuite/jit.dg/all-non-failing-tests.h | 10 
 gcc/testsuite/jit.dg/test-signed-char.c  | 52 
 3 files changed, 63 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/jit.dg/test-signed-char.c

diff --git a/gcc/jit/dummy-frontend.cc b/gcc/jit/dummy-frontend.cc
index dbeeacd17a8..dc1347b714a 100644
--- a/gcc/jit/dummy-frontend.cc
+++ b/gcc/jit/dummy-frontend.cc
@@ -1029,7 +1029,7 @@ jit_langhook_init (void)
   diagnostic_starter (global_dc) = jit_begin_diagnostic;
   diagnostic_finalizer (global_dc) = jit_end_diagnostic;
 
-  build_common_tree_nodes (false);
+  build_common_tree_nodes (flag_signed_char);
 
   build_common_builtin_nodes ();
 
diff --git a/gcc/testsuite/jit.dg/all-non-failing-tests.h b/gcc/testsuite/jit.dg/all-non-failing-tests.h
index 14a0a321550..404377a4df0 100644
--- a/gcc/testsuite/jit.dg/all-non-failing-tests.h
+++ b/gcc/testsuite/jit.dg/all-non-failing-tests.h
@@ -353,6 +353,13 @@
 /* test-setting-alignment.c: This can't be in the testcases array as it
is target-specific.  */
 
+/* test-signed-char.c */
+#define create_code create_code_signed_char
+#define verify_code verify_code_signed_char
+#include "test-signed-char.c"
+#undef create_code
+#undef verify_code
+
 /* test-sizeof.c */
 #define create_code create_code_sizeof
 #define verify_code verify_code_sizeof
@@ -560,6 +567,9 @@ const struct testcase testcases[] = {
   {"reflection",
create_code_reflection ,
verify_code_reflection },
+  {"signed-char",
+   create_code_signed_char,
+   verify_code_signed_char},
   {"sizeof",
create_code_sizeof,
verify_code_sizeof},
diff --git a/gcc/testsuite/jit.dg/test-signed-char.c b/gcc/testsuite/jit.dg/test-signed-char.c
new file mode 100644
index 000..c12b41d92cc
--- /dev/null
+++ b/gcc/testsuite/jit.dg/test-signed-char.c
@@ -0,0 +1,52 @@
+#include 
+#include 
+#include 
+
+#include "libgccjit.h"
+
+#include "harness.h"
+
+void
+create_code (gcc_jit_context *ctxt, void *user_data)
+{
+  /* Let's try to inject the equivalent of:
+int test_signed_char ()
+{
+char val = -2;
+return (int) val;
+}
+*/
+  gcc_jit_type *char_type =
+gcc_jit_context_get_type (ctxt, GCC_JIT_TYPE_CHAR);
+  gcc_jit_type *int_type =
+gcc_jit_context_get_type (ctxt, GCC_JIT_TYPE_INT);
+
+  gcc_jit_function *test_fn =
+gcc_jit_context_new_function (ctxt, NULL,
+  GCC_JIT_FUNCTION_EXPORTED,
+  int_type,
+  "test_signed_char",
+  0, NULL,
+  0);
+
+  gcc_jit_block *block = gcc_jit_function_new_block(test_fn, "entry");
+
+  gcc_jit_rvalue *val = gcc_jit_context_new_rvalue_from_int (ctxt,
+char_type, -2);
+  gcc_jit_rvalue *return_value = gcc_jit_context_new_cast (
+ctxt, NULL, val, int_type);
+
+  gcc_jit_block_end_with_return (block, NULL, return_value);
+}
+
+void
+verify_code (gcc_jit_context *ctxt, gcc_jit_result *resu

Re: [PATCH RFA] build: drop target libs from LD_LIBRARY_PATH [PR105688]

2024-02-22 Thread Iain Sandoe
Hi Gaius,

> On 22 Feb 2024, at 18:06, Gaius Mulley  wrote:
> 
> Iain Sandoe  writes:
> 
>> Right now, AFAIK the only target runtimes used by host tools are
>> libstdc++, libgcc and libgnat.  I agree that might change with rust -
>> since the rust folks are talking about using one of the runtimes in
>> the FE, I am not aware of other language FEs requiring their target
>> runtimes to be available to the host tools (adding Gaius in case I
>> missed something with m2 - which is quite complex in the
>> bootstrapping).

> the m2 infrastructure translates and builds gcc/m2/gm2-libs along with
> gcc/m2/gm2-compiler and uses these objects for cc1gm2, pge, mc etc -
> rather than the library archives generated from /libgm2

If I understand this (and my builds of the m2 stuff) correctly, this is done
locally to the builds of the host-side components; in particular not controlled
by the top level Makefile.{tpl,def}?

(So we do not see builds of libgm2 in stage1/2, but only in the
stage3 target builds?)

In which case, this should be outside the scope of the patch here.

Iain



Re: [patch] fix libsanitizer build with -D_TIME_BITS=64 -D_FILE_OFFSET_BITS=64 on 32bit architectures

2024-02-22 Thread Rainer Orth
Hi Matthias,

> libsanitizer fails to build with -D_TIME_BITS=64 -D_FILE_OFFSET_BITS=64,
> triggering an #error in /usr/include/features-time64.h
>
> --- a/libsanitizer/sanitizer_common/sanitizer_procmaps_solaris.cpp
> +++ b/libsanitizer/sanitizer_common/sanitizer_procmaps_solaris.cpp
> @@ -11,6 +11,7 @@
>
>  // Before Solaris 11.4,  doesn't work in a largefile
>  environment.
>  #undef _FILE_OFFSET_BITS
> +#undef _TIME_BITS
>  #include "sanitizer_platform.h"
>  #if SANITIZER_SOLARIS
>  #  include 
> --- a/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.cpp
> +++ b/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.cpp
> @@ -18,6 +18,7 @@
>  // depends on _FILE_OFFSET_BITS setting.
>  // To get this "true" dirent definition, we undefine _FILE_OFFSET_BITS
>  below.
>  #undef _FILE_OFFSET_BITS
> +#undef _TIME_BITS
>  #endif
>
>  // Must go after undef _FILE_OFFSET_BITS.
>
>
> The patch to sanitizer_platform_limits_posix.cpp is already present in
> trunk, but missing from the branches.
>
> Because all platform files are built in GCC, you also see the failure in
> sanitizer_procmaps_solaris.cpp. Just doing the same as for the posix 
> files fixes the issue and libsanitizer builds again.
>
> Does this have any effect on the solaris builds?  If not, ok for the trunk
> and the branches?

Since _TIME_BITS isn't used in Solaris system headers at all, there's no
impact.

However, the sanitizer_procmaps_solaris.cpp change needs to go into
upstream LLVM first and can only then be cherry-picked into libsanitizer
once it has been committed there.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH] Fix fortran/PR114024

2024-02-22 Thread Harald Anlauf

Hi Steve!

On 2/22/24 01:52, Steve Kargl wrote:

On Wed, Feb 21, 2024 at 01:42:32PM -0800, Steve Kargl wrote:

On Wed, Feb 21, 2024 at 10:20:43PM +0100, Harald Anlauf wrote:

On 2/21/24 22:00, Steve Kargl wrote:

memleak vs ICE.  I think I'll take one over the other.
Probably need to free code->expr3 before the copy.


Yep.


I tried gfc_replace_expr in an earlier patch.  It did not
work.



I tried freeing code->expr3 before assigning the new expression.
That leads to

% gfcx -c ~/gcc/gccx/gcc/testsuite/gfortran.dg/allocate_with_source_28.f90
pid 69473 comm f951 has trashed its stack, killing
gfortran: internal compiler error: Illegal instruction signal terminated program f951


Right.  I also don't see what the lifetimes of the expressions are.

But is the gfc_copy_expr really needed?  Wouldn't the following suffice?

  code->expr3 = gfc_get_parentheses (code->expr3);


If I don't free code->expr3 but simply assign the new
expression from gfc_get_parentheses(), your example
now compiles and executes as expected.  It now also compiles
allocate_with_source_28.f90.  Caveat:  I don't know
how to test the CLASS uu.


- it still fails on the following code, because the traversal
of the refs is incomplete / wrong:

program foo
 implicit none
 complex   :: cmp(3)
 real, pointer :: pp(:)
 class(*), allocatable :: uu(:)
 type t
real :: re
real :: im
 end type t
 type u
type(t) :: tt(3)
 end type u
 type(u) :: cc

 cmp = (3.45,6.78)
 cc% tt% re = cmp% re
 cc% tt% im = cmp% im
 allocate (pp, source = cc% tt% im)   ! ICE


cc%tt%im isn't a complex-part-ref, so this seems to
be a different (maybe related) issue.  Does the code
compile with 'source = (cc%tt%im)'?  If so, perhaps
detecting a component reference and doing the simple
wrapping with parentheses can be done.


Yes, that's why I tried to make up the above example.
I think %re and %im are not too special, they work
here pretty much like component refs elsewhere.



I see.  The %re and %im complex-part-ref correspond to
ref->u.i == INQUIRY_RE and INQUIRY_IM, respectively.
A part-ref for a user-defined type doesn't have an
INQUIRY_xxx, so we'll need to see if there is a way to
easily identify, e.g., cc%tt%re from your testcase.


The attached patch uses ref->type == REF_COMPONENT to deal
with the above code.


I actually wanted to draw your attention away from the
real/complex stuff, because that is not really the point.
When do we actually need to enforce the parentheses?

I tried the following, and it seems to work:

  if (code->expr3->expr_type == EXPR_VARIABLE
  && is_subref_array (code->expr3))
code->expr3 = gfc_get_parentheses (code->expr3);

(Beware: this is not regtested!)

On the positive side, it not only seems to fix the cases in question,
but also substring references etc., like the following:

program foo
  implicit none
  complex   :: cmp(3) = (3.45,6.78)
  real, pointer :: pp(:)
  integer, allocatable  :: aa(:)
  class(*), allocatable :: uu(:), vv(:)
  type t   ! pseudo "complex" type
 real :: re
 real :: im
  end type t
  type ci  ! "complex integer" type
 integer :: re
 integer :: im
  end type ci
  type u
 type(t)  :: tt(3)
 type(ci) :: ii(3)
  end type u
  type(u) :: cc
  character(3)  :: str(3) = ["abc","def","ghi"]
  character(:), allocatable :: ac(:)

  allocate (ac, source=str(1::2)(2:3))
  print *, str(1::2)(2:3)
  call my_print (ac)
  cc% tt% re = cmp% re
  cc% tt% im = cmp% im
  cc% ii% re = nint (cmp% re)
  cc% ii% im = nint (cmp% im)
  print *, "re=", cc% tt% re
  print *, "im=", cc% tt% im
  allocate (pp, source = cc% tt% re)
  print *, pp
  allocate (uu, source = cc% tt% im)
  call my_print (uu)
  allocate (vv, source = cc% ii% im)
  call my_print (vv)
contains
  subroutine my_print (x)
class(*), intent(in) :: x(:)
select type (x)
type is (real)
   print *, "'real':", x
type is (integer)
   print *, "'integer':", x
type is (character(*))
   print *, "'character':", x
end select
  end subroutine my_print
end

Cheers,
Harald
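
The ownership question discussed in this thread (copy, free, or plain
reassignment) can be illustrated with a toy model.  This is only a sketch
and NOT gfortran's gfc_expr or gfc_get_parentheses: it assumes, as the
discussion suggests, that the wrapper node takes over the original
expression as its operand.  All names here are hypothetical.

```cpp
#include <memory>
#include <utility>

// Toy expression node; a stand-in for illustration only.
struct Expr {
  enum Kind { Variable, Parentheses } kind;
  std::unique_ptr<Expr> operand;  // only set for Parentheses nodes
};

// Wrap `e` in a parentheses node.  Ownership of `e` moves into the new
// node, so the caller must neither copy nor free the original.
std::unique_ptr<Expr> wrap_in_parentheses(std::unique_ptr<Expr> e) {
  auto outer = std::make_unique<Expr>();
  outer->kind = Expr::Parentheses;
  outer->operand = std::move(e);
  return outer;
}
```

Under that assumption, freeing the original before wrapping would leave the
new node pointing at dead storage, and copying it first would leak the
original, which matches the ICE and the memleak reported above.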





[PATCH] RISC-V: Fix vec_init for simple sequences [PR114028].

2024-02-22 Thread Robin Dapp
Hi,

for a vec_init (_a, _a, _a, _a) with _a of mode DImode we try to
construct a "superword" of two "_a"s.  This only works for modes < Pmode
when we can "shift and or" two halves into one Pmode register.
This patch disallows the optimization for inner_mode == Pmode and emits
a simple broadcast in such a case.
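
As a rough illustration of the "shift and or" idea and of why it breaks
down for full-width elements, here is a minimal sketch.  It assumes a
64-bit Pmode and little-endian element order; the helper names are
hypothetical and the size check is a simplification of what
can_duplicate_repeating_sequence_p does.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Merge two 32-bit elements into one 64-bit "superword".  The first
// element lands in the low half.
std::uint64_t merge_pair(std::uint32_t lo, std::uint32_t hi) {
  return static_cast<std::uint64_t>(lo)
         | (static_cast<std::uint64_t>(hi) << 32);
}

// The merge is only possible when two elements fit into one register.
bool can_merge_pair(unsigned element_bits, unsigned register_bits) {
  return 2 * element_bits <= register_bits;
}

// A vector whose elements are all equal can simply be broadcast.
bool is_all_same(const std::vector<std::uint64_t> &elts) {
  for (std::size_t i = 1; i < elts.size(); ++i)
    if (elts[i] != elts[0])
      return false;
  return true;
}
```

With 32-bit elements two values fit into one 64-bit register, but with
64-bit elements (DImode on rv64) there is no room left, so { a, a, a, a }
is better handled as a plain broadcast of a.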

The test is not a run test because it requires vlen=256 in qemu.
I can adjust that still of course.

Regtested on rv64, rv32 still running.

Regards
 Robin

gcc/ChangeLog:

PR target/114028

* config/riscv/riscv-v.cc 
(rvv_builder::can_duplicate_repeating_sequence_p):
Return false if inner mode is already Pmode.
(rvv_builder::is_all_same_sequence): New function.
(expand_vec_init): Emit broadcast if sequence is all same.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/pr114028.c: New test.
---
 gcc/config/riscv/riscv-v.cc   | 25 ++-
 .../gcc.target/riscv/rvv/autovec/pr114028.c   | 25 +++
 2 files changed, 49 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr114028.c

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 0cfbd21ce6f..29d58deb995 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -443,6 +443,7 @@ public:
   }
 
   bool can_duplicate_repeating_sequence_p ();
+  bool is_repeating_sequence ();
   rtx get_merged_repeating_sequence ();
 
   bool repeating_sequence_use_merge_profitable_p ();
@@ -483,7 +484,8 @@ rvv_builder::can_duplicate_repeating_sequence_p ()
 {
   poly_uint64 new_size = exact_div (full_nelts (), npatterns ());
   unsigned int new_inner_size = m_inner_bits_size * npatterns ();
-  if (!int_mode_for_size (new_inner_size, 0).exists (&m_new_inner_mode)
+  if (m_inner_mode == Pmode
+  || !int_mode_for_size (new_inner_size, 0).exists (&m_new_inner_mode)
   || GET_MODE_SIZE (m_new_inner_mode) > UNITS_PER_WORD
   || !get_vector_mode (m_new_inner_mode, new_size).exists (&m_new_mode))
 return false;
@@ -492,6 +494,18 @@ rvv_builder::can_duplicate_repeating_sequence_p ()
   return nelts_per_pattern () == 1;
 }
 
+/* Return true if the vector is a simple sequence with one pattern and all
+   elements the same.  */
+bool
+rvv_builder::is_repeating_sequence ()
+{
+  if (npatterns () > 1)
+return false;
+  if (full_nelts ().is_constant ())
+return repeating_sequence_p (0, full_nelts ().to_constant (), 1);
+  return nelts_per_pattern () == 1;
+}
+
 /* Return true if it is a repeating sequence that using
merge approach has better codegen than using default
approach (slide1down).
@@ -2544,6 +2558,15 @@ expand_vec_init (rtx target, rtx vals)
 v.quick_push (XVECEXP (vals, 0, i));
   v.finalize ();
 
+  /* If the sequence is v = { a, a, a, a } just broadcast an element.  */
+  if (v.is_repeating_sequence ())
+{
+  machine_mode mode = GET_MODE (target);
+  rtx dup = expand_vector_broadcast (mode, v.elt (0));
+  emit_move_insn (target, dup);
+  return;
+}
+
   if (nelts > 3)
 {
   /* Case 1: Convert v = { a, b, a, b } into v = { ab, ab }.  */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr114028.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr114028.c
new file mode 100644
index 000..a451d85e3fe
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr114028.c
@@ -0,0 +1,25 @@
+/* { dg-do compile }  */
+/* { dg-options "-march=rv64gcv_zvl256b -O3" } */
+
+int a, d = 55003;
+long c = 0, h;
+long e = 1;
+short i;
+
+int
+main ()
+{
+  for (int g = 0; g < 16; g++)
+{
+  d |= c;
+  short l = d;
+  i = l < 0 || a >> 4 ? d : a;
+  h = i - 8L;
+  e &= h;
+}
+
+  if (e != 1)
+__builtin_abort ();
+}
+
+/* { dg-final { scan-assembler-times "vmv\.v\.i\tv\[0-9\],0" 0 } } */
-- 
2.43.2


Re: [PATCH 0/2 V2] aarch64: Place target independent and dependent code in one file.

2024-02-22 Thread Richard Sandiford
Ajit Agarwal  writes:
> Hello Alex/Richard:
>
> I have placed target independent and target dependent code in
> aarch64-ldp-fusion for load store fusion.
>
> Common infrastructure of load store pair fusion is divided into
> target independent and target dependent code.
>
> Target independent code is the generic code with pure virtual
> functions to interface between target independent and dependent
> code.
>
> Target dependent code is the implementation of pure virtual
> function for aarch64 target and the call to target independent
> code.

Thanks for the update.  This is still quite hard to review though.
Sorry to ask for another round, but could you split it up further?
The ideal thing would be if patches that move code do nothing other
than move code, and if patches that change code do those changes
in-place.

Richard

>
> Bootstrapped in aarch64-linux-gnu.
>
> Thanks & Regards
> Ajit
>
>
> aarch64: Place target independent and dependent code in one file.
>
> Common infrastructure of load store pair fusion is divided into
> target independent and target dependent code.
>
> Target independent code is the generic code with pure virtual
> functions to interface between target independent and dependent
> code.
>
> Target dependent code is the implementation of pure virtual
> function for aarch64 target and the call to target independent
> code.
>
> 2024-02-15  Ajit Kumar Agarwal  
>
> gcc/ChangeLog:
>
>   * config/aarch64/aarch64-ldp-fusion.cc: Place target
>   independent and dependent code.
> ---
>  gcc/config/aarch64/aarch64-ldp-fusion.cc | 3513 --
>  1 file changed, 1842 insertions(+), 1671 deletions(-)
>
> diff --git a/gcc/config/aarch64/aarch64-ldp-fusion.cc b/gcc/config/aarch64/aarch64-ldp-fusion.cc
> index 22ed95eb743..0ab842e2bbb 100644
> --- a/gcc/config/aarch64/aarch64-ldp-fusion.cc
> +++ b/gcc/config/aarch64/aarch64-ldp-fusion.cc
> @@ -17,6 +17,7 @@
>  // along with GCC; see the file COPYING3.  If not see
>  // .
>  
> +
>  #define INCLUDE_ALGORITHM
>  #define INCLUDE_FUNCTIONAL
>  #define INCLUDE_LIST
> @@ -37,13 +38,12 @@
>  #include "tree-hash-traits.h"
>  #include "print-tree.h"
>  #include "insn-attr.h"
> -
>  using namespace rtl_ssa;
>  
> -static constexpr HOST_WIDE_INT LDP_IMM_BITS = 7;
> -static constexpr HOST_WIDE_INT LDP_IMM_SIGN_BIT = (1 << (LDP_IMM_BITS - 1));
> -static constexpr HOST_WIDE_INT LDP_MAX_IMM = LDP_IMM_SIGN_BIT - 1;
> -static constexpr HOST_WIDE_INT LDP_MIN_IMM = -LDP_MAX_IMM - 1;
> +static constexpr HOST_WIDE_INT PAIR_MEM_IMM_BITS = 7;
> +static constexpr HOST_WIDE_INT PAIR_MEM_IMM_SIGN_BIT = (1 << (PAIR_MEM_IMM_BITS - 1));
> +static constexpr HOST_WIDE_INT PAIR_MEM_MAX_IMM = PAIR_MEM_IMM_SIGN_BIT - 1;
> +static constexpr HOST_WIDE_INT PAIR_MEM_MIN_IMM = -PAIR_MEM_MAX_IMM - 1;
>  
>  // We pack these fields (load_p, fpsimd_p, and size) into an integer
>  // (LFS) which we use as part of the key into the main hash tables.
> @@ -138,8 +138,144 @@ struct alt_base
>poly_int64 offset;
>  };
>  
> +// Class that implements a state machine for building the changes needed to form
> +// a store pair instruction.  This allows us to easily build the changes in
> +// program order, as required by rtl-ssa.
> +struct stp_change_builder
> +{
> +  enum class state
> +  {
> +FIRST,
> +INSERT,
> +FIXUP_USE,
> +LAST,
> +DONE
> +  };
> +
> +  enum class action
> +  {
> +TOMBSTONE,
> +CHANGE,
> +INSERT,
> +FIXUP_USE
> +  };
> +
> +  struct change
> +  {
> +action type;
> +insn_info *insn;
> +  };
> +
> +  bool done () const { return m_state == state::DONE; }
> +
> +  stp_change_builder (insn_info *insns[2],
> +   insn_info *repurpose,
> +   insn_info *dest)
> +: m_state (state::FIRST), m_insns { insns[0], insns[1] },
> +  m_repurpose (repurpose), m_dest (dest), m_use (nullptr) {}
> +
> +  change get_change () const
> +  {
> +switch (m_state)
> +  {
> +  case state::FIRST:
> + return {
> +   m_insns[0] == m_repurpose ? action::CHANGE : action::TOMBSTONE,
> +   m_insns[0]
> + };
> +  case state::LAST:
> + return {
> +   m_insns[1] == m_repurpose ? action::CHANGE : action::TOMBSTONE,
> +   m_insns[1]
> + };
> +  case state::INSERT:
> + return { action::INSERT, m_dest };
> +  case state::FIXUP_USE:
> + return { action::FIXUP_USE, m_use->insn () };
> +  case state::DONE:
> + break;
> +  }
> +
> +gcc_unreachable ();
> +  }
> +
> +  // Transition to the next state.
> +  void advance ()
> +  {
> +switch (m_state)
> +  {
> +  case state::FIRST:
> + if (m_repurpose)
> +   m_state = state::LAST;
> + else
> +   m_state = state::INSERT;
> + break;
> +  case state::INSERT:
> +  {
> + def_info *def = memory_access (m_insns[0]->defs ());
> + while (*def->next_def ()->insn () <= *m_dest)
> +   def = def->next_def ();
> +
> +  

[patch] fix libsanitizer build with -D_TIME_BITS=64 -D_FILE_OFFSET_BITS=64 on 32bit architectures

2024-02-22 Thread Matthias Klose
libsanitizer fails to build with -D_TIME_BITS=64 -D_FILE_OFFSET_BITS=64, 
triggering an #error in /usr/include/features-time64.h


--- a/libsanitizer/sanitizer_common/sanitizer_procmaps_solaris.cpp
+++ b/libsanitizer/sanitizer_common/sanitizer_procmaps_solaris.cpp
@@ -11,6 +11,7 @@

 // Before Solaris 11.4,  doesn't work in a largefile 
environment.

 #undef _FILE_OFFSET_BITS
+#undef _TIME_BITS
 #include "sanitizer_platform.h"
 #if SANITIZER_SOLARIS
 #  include 
--- a/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.cpp
+++ b/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.cpp
@@ -18,6 +18,7 @@
 // depends on _FILE_OFFSET_BITS setting.
 // To get this "true" dirent definition, we undefine _FILE_OFFSET_BITS 
below.

 #undef _FILE_OFFSET_BITS
+#undef _TIME_BITS
 #endif

 // Must go after undef _FILE_OFFSET_BITS.


The patch to sanitizer_platform_limits_posix.cpp is already present in 
trunk, but missing from the branches.


Because all platform files are built in GCC, you also see the failure in 
sanitizer_procmaps_solaris.cpp. Just doing the same as for the posix 
files fixes the issue and libsanitizer builds again.
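
The pattern used in both hunks can be sketched in isolation.  This is an
illustrative fragment, not part of the patch: the two #define lines stand
in for the -D flags on the command line, and the helper names are made up.
Presumably the #error fires because, once _FILE_OFFSET_BITS has been
undefined, _TIME_BITS=64 is left on its own.

```cpp
// Stand-ins for -D_FILE_OFFSET_BITS=64 -D_TIME_BITS=64 on the command line.
#define _FILE_OFFSET_BITS 64
#define _TIME_BITS 64

// Undefine both together, before any system header is included, so that
// _TIME_BITS=64 is never seen without _FILE_OFFSET_BITS=64.
#undef _FILE_OFFSET_BITS
#undef _TIME_BITS

#include <sys/types.h>
#include <time.h>

// Report whether the macros survived to this point.
constexpr bool time_bits_defined() {
#ifdef _TIME_BITS
  return true;
#else
  return false;
#endif
}

constexpr bool file_offset_bits_defined() {
#ifdef _FILE_OFFSET_BITS
  return true;
#else
  return false;
#endif
}
```

Both helpers return false here, which is the state the two sanitizer files
need to be in when they pull in the system headers.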


Does this have any effect on the solaris builds?  If not, ok for the 
trunk and the branches?


Matthias


Re: [PATCH v1] RISC-V: Upgrade RVV intrinsic version to 0.12

2024-02-22 Thread Palmer Dabbelt

On Wed, 21 Feb 2024 16:02:50 PST (-0800), Kito Cheng wrote:

Palmer Dabbelt  wrote on Thu, Feb 22, 2024 at 07:42:


On Wed, 21 Feb 2024 15:34:32 PST (-0800), Kito Cheng wrote:
> LGTM for the patch
>
> Li, Pan2  wrote on Wed, Feb 21, 2024 at 12:31:
>
>> Hi kito and juzhe.
>>
>> There may be 2 items for double-confirm. Thanks a lot.
>>
>> 1. Not very sure if we need to upgrade the version for
>> __riscv_th_v_intrinsic.
>>
>
> Yes since 0.11 and 0.12 is not really compatible

Where are the incompatibilities?  The whole reason we accepted the
intrinsics in the first place is because the RVI folks said they
wouldn't break compatibility, if that's changed then just dropping the
old version is going to break users.



Compared to 0.11, 0.12 adds an interface for segment load/store and new
fixed-point intrinsics. The first item is not an incompatible change, since
it is newly added and GCC 13 did not implement the legacy one. The latter
was kind of broken on both LLVM and GCC, which made it not really useful in
practice.

Other than that, everything is the same. It's not 100% compatible, so I don't
intend to cheat myself by saying it's compatible, but we do think it's a
necessary evil, since the fixed-point stuff did not have the right design and
implementation.


OK, those don't seem so scary.  So maybe let's just put it in a NEWS 
entry or something?  It's mildly interesting to users, but I agree the 
earlier intrinsics spec was vague enough in some areas we can get away 
with the diffs I've seen.



Anyway, it's now in frozen mode: 1.0 rc0 has been tagged, and no API will
be changed or removed.


OK, so I guess we should move to 1.0, then?  Are you guys going to pick 
that up?






>> 2. Do we need to upgrade to an even newer version (like 1.0) for the GCC 14
>> release, or can we do it later?
>>
>
> Yeah, Ideal case is we can update that before release made :p
>
>
>
>
>> Pan
>>
>> -Original Message-
>> From: Li, Pan2 
>> Sent: Wednesday, February 21, 2024 12:27 PM
>> To: gcc-patches@gcc.gnu.org
>> Cc: juzhe.zh...@rivai.ai; Li, Pan2 ; Wang, Yanzhang <yanzhang.w...@intel.com>; kito.ch...@gmail.com
>> Subject: [PATCH v1] RISC-V: Upgrade RVV intrinsic version to 0.12
>>
>> From: Pan Li 
>>
>> Upgrade the version of RVV intrinsic from 0.11 to 0.12.
>>
>> PR target/114017
>>
>> gcc/ChangeLog:
>>
>> * config/riscv/riscv-c.cc (riscv_cpu_cpp_builtins): Upgrade
>> the version to 0.12.
>>
>> gcc/testsuite/ChangeLog:
>>
>> * gcc.target/riscv/predef-__riscv_v_intrinsic.c: Update the
>> version to 0.12.
>> * gcc.target/riscv/rvv/base/pr114017-1.c: New test.
>>
>> Signed-off-by: Pan Li 
>> ---
>>  gcc/config/riscv/riscv-c.cc   |  2 +-
>>  .../riscv/predef-__riscv_v_intrinsic.c|  2 +-
>>  .../gcc.target/riscv/rvv/base/pr114017-1.c| 19 +++
>>  3 files changed, 21 insertions(+), 2 deletions(-)
>>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr114017-1.c
>>
>> diff --git a/gcc/config/riscv/riscv-c.cc b/gcc/config/riscv/riscv-c.cc
>> index 3ef06dcfd2d..3755ec0b8ef 100644
>> --- a/gcc/config/riscv/riscv-c.cc
>> +++ b/gcc/config/riscv/riscv-c.cc
>> @@ -139,7 +139,7 @@ riscv_cpu_cpp_builtins (cpp_reader *pfile)
>>  {
>>builtin_define ("__riscv_vector");
>>builtin_define_with_int_value ("__riscv_v_intrinsic",
>> -riscv_ext_version_value (0, 11));
>> +riscv_ext_version_value (0, 12));
>>  }
>>
>> if (TARGET_XTHEADVECTOR)
>> diff --git a/gcc/testsuite/gcc.target/riscv/predef-__riscv_v_intrinsic.c
>> b/gcc/testsuite/gcc.target/riscv/predef-__riscv_v_intrinsic.c
>> index dbbedf54f87..07f1f159a8f 100644
>> --- a/gcc/testsuite/gcc.target/riscv/predef-__riscv_v_intrinsic.c
>> +++ b/gcc/testsuite/gcc.target/riscv/predef-__riscv_v_intrinsic.c
>> @@ -3,7 +3,7 @@
>>
>>  int main () {
>>
>> -#if __riscv_v_intrinsic != 11000
>> +#if __riscv_v_intrinsic != 12000
>>  #error "__riscv_v_intrinsic"
>>  #endif
>>
>> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr114017-1.c
>> b/gcc/testsuite/gcc.target/riscv/rvv/base/pr114017-1.c
>> new file mode 100644
>> index 000..8eee7c68f71
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr114017-1.c
>> @@ -0,0 +1,19 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3" } */
>> +
>> +#include "riscv_vector.h"
>> +
>> +vuint8mf2_t
>> +test (vuint16m1_t val, size_t shift, size_t vl)
>> +{
>> +#if __riscv_v_intrinsic == 11000
>> +  #warning "RVV Intrinsics v0.11"
>> +  return __riscv_vnclipu (val, shift, vl);
>> +#endif
>> +
>> +#if __riscv_v_intrinsic == 12000
>> +  #warning "RVV Intrinsics v0.12" /* { dg-warning "RVV Intrinsics v0.12" } */
>> +  return __riscv_vnclipu (val, shift, 0, vl);
>> +#endif
>> +}
>> +
>> --
>> 2.34.1
>>
>>
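
For readers wondering how 0.11 and 0.12 map to 11000 and 12000, here is a
minimal sketch.  The minor factor follows from the values checked in the
tests above; the factor used for the major number is an assumption of this
sketch, and both function names are hypothetical.

```cpp
// Encode an intrinsic API version as a single integer.
// 0.11 -> 11000 and 0.12 -> 12000 match the tests in the patch; the
// multiplier for `major` is assumed, not taken from the patch.
constexpr long rvv_intrinsic_version(long major, long minor) {
  return major * 1000000 + minor * 1000;
}

// Number of arguments of the __riscv_vnclipu call used in pr114017-1.c:
// (val, shift, vl) for 0.11 and (val, shift, 0, vl) for 0.12.
constexpr int vnclipu_argument_count(long version) {
  return version >= rvv_intrinsic_version(0, 12) ? 4 : 3;
}
```

User code can compare __riscv_v_intrinsic against such a value to select
the right call form, as the new test does with its two #if blocks.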



Re: [PATCH] c++: -Wuninitialized when binding a ref to uninit DM [PR113987]

2024-02-22 Thread Marek Polacek
On Thu, Feb 22, 2024 at 08:34:45AM +, Jason Merrill wrote:
> On 2/20/24 19:15, Marek Polacek wrote:
> > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> > 
> > -- >8 --
> > This PR asks that our -Wuninitialized for mem-initializers does
> > not warn when binding a reference to an uninitialized data member.
> > We already check !INDIRECT_TYPE_P in find_uninit_fields_r, but
> > that won't catch binding a parameter of a reference type to an
> > uninitialized field, as in:
> > 
> >struct S { S (int&); };
> >struct T {
> >T() : s(i) {}
> >S s;
> >int i;
> >};
> > 
> > This patch adds a new function to handle this case.
> 
> For type_build_ctor_call types like S, it's weird that we currently
> find_uninit_fields before building the initialization.  What if we move the
> check after the build_aggr_init so we have the actual initializer instead of
> just the expression?

Thanks.  I've tried but unfortunately I'm not getting anywhere.  One
problem is that immediately after the find_uninit_fields call we may
change the TREE_LIST in

  if (init && TREE_CODE (init) == TREE_LIST)
 //...

so we'd have to cope with that somehow.  Sinking find_uninit_fields
into one of the conditions below looks like a complication.  Another
problem is that calling find_uninit_fields on the result of
build_aggr_init call causes a bogus warning: we create something like
E::E (&((struct F *) this)->e, ((struct F *) this)->a)
and then warn that the this object is uninitialized.  So I'm not sure
if that fix would be simpler.

Marek
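
For reference, the case from the PR can be turned into a small
self-contained example.  This is a sketch that extends the quoted
S/T snippet with an initializer for i and a stored reference, both added
here for illustration, to show why binding alone is harmless.

```cpp
struct S {
  int &ref;
  // Only binds the reference; the referenced int is not read here.
  explicit S(int &r) : ref(r) {}
};

struct T {
  S s;
  int i;
  // Members are initialized in declaration order, so s is constructed
  // while i is still uninitialized.  That is fine because S only binds.
  T() : s(i), i(42) {}
};

// Reads through the reference after construction has finished.
int read_through_reference() {
  T t;
  return t.s.ref;
}
```

By the time anything reads through the reference, i has been initialized,
so a -Wuninitialized warning on s(i) would be a false positive.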



[patch, libgfortran] PR105456 Child I/O does not propagate iostat

2024-02-22 Thread Jerry D

Hi all,

The attached fix adds a check for an error condition from a UDDTIO 
procedure in the case where there is no actual underlying error, but the 
user defines an error by setting the iostat variable manually before 
returning to the parent READ.


I did not address the case of a formatted WRITE or unformatted 
READ/WRITE until I get some feedback on the approach. If this approach 
is OK I would like to commit and then do a separate patch for the cases 
I just mentioned.


Feedback appreciated.  Regression tested on x86_64. OK for trunk?

Jerry

Author: Jerry DeLisle 
Date:   Thu Feb 22 10:48:39 2024 -0800

libgfortran: Propagate user defined iostat and iomsg.

PR libfortran/105456

libgfortran/ChangeLog:

* io/list_read.c (list_formatted_read_scalar): Add checks
for the case where a user defines their own error codes
and error messages and generate the runtime error.

gcc/testsuite/ChangeLog:

* gfortran.dg/pr105456.f90: New test.

diff --git a/gcc/testsuite/gfortran.dg/pr105456.f90 b/gcc/testsuite/gfortran.dg/pr105456.f90
new file mode 100644
index 000..411873f4aed
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr105456.f90
@@ -0,0 +1,41 @@
+! { dg-do run }
+! { dg-shouldfail "The users message" }
+module sk1
+  implicit none
+  type char
+ character :: ch
+  end type char
+  interface read (formatted)
+ module procedure read_formatted
+  end interface read (formatted)
+contains
+  subroutine read_formatted (dtv, unit, iotype, vlist, piostat, piomsg)
+class (char), intent(inout) :: dtv
+integer, intent(in) :: unit
+character (len=*), intent(in) :: iotype
+integer, intent(in) :: vlist(:)
+integer, intent(out) :: piostat
+character (len=*), intent(inout) :: piomsg
+character :: ch
+read (unit,fmt='(A1)', advance="no", iostat=piostat, iomsg=piomsg) ch
+piostat = 42
+piomsg="The users message"
+dtv%ch = ch
+  end subroutine read_formatted
+end module sk1
+
+program skip1
+  use sk1
+  implicit none
+  integer :: myerror = 0
+  character(64) :: mymessage = ""
+  type (char) :: x
+  open (10,status="scratch")
+  write (10,'(A)') '', 'a'
+  rewind (10)
+  read (10,*) x
+  print *, myerror, mymessage
+  write (*,'(10(A))') "Read: '",x%ch,"'"
+end program skip1
+! { dg-output ".*(unit = 10, file = .*)" }
+! { dg-output "Fortran runtime error: The users message" }
diff --git a/libgfortran/io/list_read.c b/libgfortran/io/list_read.c
index 3d29cb64813..ee3ab713519 100644
--- a/libgfortran/io/list_read.c
+++ b/libgfortran/io/list_read.c
@@ -2138,6 +2138,7 @@ static int
 list_formatted_read_scalar (st_parameter_dt *dtp, bt type, void *p,
 			int kind, size_t size)
 {
+  char message[MSGLEN];
   gfc_char4_t *q, *r;
   size_t m;
   int c;
@@ -2247,7 +2248,7 @@ list_formatted_read_scalar (st_parameter_dt *dtp, bt type, void *p,
 	  child_iostat = ((dtp->common.flags & IOPARM_HAS_IOSTAT)
 			  ? dtp->common.iostat : &noiostat);
 
-	  /* Set iomsge, intent(inout).  */
+	  /* Set iomsg, intent(inout).  */
 	  if (dtp->common.flags & IOPARM_HAS_IOMSG)
 	{
 	  child_iomsg = dtp->common.iomsg;
@@ -2266,6 +2267,25 @@ list_formatted_read_scalar (st_parameter_dt *dtp, bt type, void *p,
 			  iotype_len, child_iomsg_len);
 	  dtp->u.p.child_saved_iostat = *child_iostat;
 	  dtp->u.p.current_unit->child_dtio--;
+
+
+	  if ((dtp->u.p.child_saved_iostat != 0) &&
+	  !(dtp->common.flags & IOPARM_HAS_IOMSG) &&
+	  !(dtp->common.flags & IOPARM_HAS_IOSTAT))
+	{
+	  /* Trim trailing spaces from the message.  */
+	  for(int i = IOMSG_LEN - 1; i > 0; i--)
+		if (!isspace(child_iomsg[i]))
+		  {
+		/* Add two to get back to the end of child_iomsg.  */
+		child_iomsg_len = i+2;
+		break;
+		  }
+	  free_line (dtp);
+	  snprintf (message, child_iomsg_len, child_iomsg);
+	  generate_error (&dtp->common, dtp->u.p.child_saved_iostat,
+			  message);
+	}
   }
   break;
 default:
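
The trimming step in the hunk above can be sketched on its own.  This is
an illustrative stand-alone version, not the libgfortran code: the function
names are hypothetical, and the sketch copies the bytes directly instead of
passing the user's message to snprintf as a format string, which is a
choice of this sketch rather than something the patch does.

```cpp
#include <cstddef>
#include <cstring>

// Length of buf[0..len) without trailing blanks.  Fortran character
// variables are blank padded, so the message has no NUL terminator.
std::size_t trimmed_length(const char *buf, std::size_t len) {
  while (len > 0 && buf[len - 1] == ' ')
    --len;
  return len;
}

// Copy the trimmed message into a NUL-terminated buffer of capacity cap,
// truncating if necessary.
void copy_message(char *dst, std::size_t cap,
                  const char *src, std::size_t len) {
  if (cap == 0)
    return;
  std::size_t n = trimmed_length(src, len);
  if (n >= cap)
    n = cap - 1;
  std::memcpy(dst, src, n);
  dst[n] = '\0';
}
```

The resulting C string can then be handed to an error-reporting routine
together with the saved iostat value.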


RE: [PATCH] [X86_64]: Enable support for next generation AMD Zen5 CPU with znver5 scheduler Model

2024-02-22 Thread Anbazhagan, Karthiban
Hi,

Please find attached the patch that enables support for the next generation
AMD Zen5 CPU via -march=znver5, with a basic znver5 scheduler model.
The znver5 scheduler model is combined with the existing znver4 scheduler
model in a single file, "zn4zn5.md".

Automata size, measured with the command "size -A gcc/insn-automata.o":
Before patch: 1575958
After patch: 1670964

Thanks and Regards
Karthiban

-Original Message-
From: Anbazhagan, Karthiban
Sent: Wednesday, February 14, 2024 6:54 PM
To: Jan Hubicka 
Cc: gcc-patches@gcc.gnu.org; Kumar, Venkataramanan 
; Joshi, Tejas Sanjay 
; Nagarajan, Muthu kumar raj 
; Gopalasubramanian, Ganesh 

Subject: RE: [PATCH] [X86_64]: Enable support for next generation AMD Zen5 CPU 
with znver5 scheduler Model

Hi,

>>I assume the znver5 costs are same as znver4 so far?

Costing table updated for below entries.
+  {COSTS_N_INSNS (10), /* cost of a divide/mod for QI.  */
+   COSTS_N_INSNS (11), /*  HI.  */
+   COSTS_N_INSNS (16), /*  DI.  */
+   COSTS_N_INSNS (16)},/*  other.  */
+  COSTS_N_INSNS (10),  /* cost of DIVSS instruction.  */
+  COSTS_N_INSNS (14),  /* cost of SQRTSS instruction.  */
+  COSTS_N_INSNS (20),  /* cost of SQRTSD instruction.  */


>> we can just change znver4.md to also work for znver5?
We will combine znver4 and znver5 scheduler descriptions into one

Thanks and Regards
Karthiban

-----Original Message-----
From: Jan Hubicka 
Sent: Monday, February 12, 2024 9:30 PM
To: Anbazhagan, Karthiban 
Cc: gcc-patches@gcc.gnu.org; Kumar, Venkataramanan 
; Joshi, Tejas Sanjay 
; Nagarajan, Muthu kumar raj 
; Gopalasubramanian, Ganesh 

Subject: Re: [PATCH] [X86_64]: Enable support for next generation AMD Zen5 CPU 
with znver5 scheduler Model



Hi,
> gcc/ChangeLog:
> * common/config/i386/cpuinfo.h (get_amd_cpu): Recognize znver5.
> * common/config/i386/i386-common.cc (processor_names): Add znver5.
> (processor_alias_table): Likewise.
> * common/config/i386/i386-cpuinfo.h (processor_types): Add new zen
> family.
> (processor_subtypes): Add znver5.
> * config.gcc (x86_64-*-* |...): Likewise.
> * config/i386/driver-i386.cc (host_detect_local_cpu): Let
> march=native detect znver5 cpu's.
> * config/i386/i386-c.cc (ix86_target_macros_internal): Add znver5.
> * config/i386/i386-options.cc (m_ZNVER5): New definition
> (processor_cost_table): Add znver5.
> * config/i386/i386.cc (ix86_reassociation_width): Likewise.
> * config/i386/i386.h (processor_type): Add PROCESSOR_ZNVER5
> (PTA_ZNVER5): New definition.
> * config/i386/i386.md (define_attr "cpu"): Add znver5.
> (Scheduling descriptions) Add znver5.md.
> * config/i386/x86-tune-costs.h (znver5_cost): New definition.
> * config/i386/x86-tune-sched.cc (ix86_issue_rate): Add znver5.
> (ix86_adjust_cost): Likewise.
> * config/i386/x86-tune.def (avx512_move_by_pieces): Add m_ZNVER5.
> (avx512_store_by_pieces): Add m_ZNVER5.
> * doc/extend.texi: Add znver5.
> * doc/invoke.texi: Likewise.
> * config/i386/znver5.md: New.
>
> gcc/testsuite/ChangeLog:
> * g++.target/i386/mv29.C: Handle znver5 arch.
> * gcc.target/i386/funcspec-56.inc: Likewise.
> +/* This table currently replicates znver4_cost table. */
> +struct processor_costs znver5_cost = {

I assume the znver5 costs are the same as znver4 so far?

> +;; AMD znver5 Scheduling
> +;; Modeling automatons for zen decoders, integer execution pipes,
> +;; AGU pipes, branch, floating point execution and fp store units.
> +(define_automaton "znver5, znver5_ieu, znver5_idiv, znver5_fdiv, znver5_agu, znver5_fpu, znver5_fp_store")
> +
> +;; Decoders unit has 4 decoders and all of them can decode fast path
> +;; and vector type instructions.
> +(define_cpu_unit "znver5-decode0" "znver5")
> +(define_cpu_unit "znver5-decode1" "znver5")
> +(define_cpu_unit "znver5-decode2" "znver5")
> +(define_cpu_unit "znver5-decode3" "znver5")

Duplicating the znver4 description for znver5 before the scheduler description is
tuned basically just increases the compiler binary size (scheduler models
are quite large).

Depending on changes between generations, I think we should try to share CPU
unit DFAs where it makes sense (i.e. the shared DFA is smaller than two DFAs).  So
perhaps until the scheduler is tuned, we can just change znver4.md to also work for
znver5?

Honza


0001-Add-AMD-znver5-processor-enablement-with-scheduler-model.patch
Description: 0001-Add-AMD-znver5-processor-enablem

Re: [PATCH] c: Handle scoped attributes in __has*attribute and scoped attribute parsing changes in -std=c11 etc. modes [PR114007]

2024-02-22 Thread Joseph Myers
On Thu, 22 Feb 2024, Jakub Jelinek wrote:

> But sure, if you prefer the COLON_SCOPE version of the patch, I can commit
> that.  There is no PREV_WHITE in the preprocessor, there is

Yes, I prefer the COLON_SCOPE version.

-- 
Joseph S. Myers
josmy...@redhat.com



Re: [PATCH v1 03/13] aarch64: Mark x18 register as a fixed register for MS ABI

2024-02-22 Thread Iain Sandoe



> On 22 Feb 2024, at 17:45, Andrew Pinski  wrote:
> 
> On Thu, Feb 22, 2024 at 3:56 AM Richard Earnshaw (lists)
>  wrote:
>> 
>> On 21/02/2024 18:30, Evgeny Karpov wrote:
>>> 
>> +/* X18 reserved for the TEB on Windows.  */
>> +#ifdef TARGET_ARM64_MS_ABI
>> +# define FIXED_X18 1
>> +# define CALL_USED_X18 0
>> +#else
>> +# define FIXED_X18 0
>> +# define CALL_USED_X18 1
>> +#endif
>> 
>> I'm not overly keen on ifdefs like this (and the one below), it can get 
>> quite confusing if we have to support more than a couple of ABIs.  Perhaps 
>> we could create a couple of new headers, one for the EABI (which all 
>> existing targets would then need to include) and one for the MS ABI.  Then 
>> the mingw port would use that instead of the EABI header.
>> 
>> An alternative is to make all this dynamic, based on the setting of the 
>> aarch64_calling_abi enum and to make the adjustments in 
>> aarch64_conditional_register_usage.
> 
> Dynamically might be needed also if we want to support ms_abi
> attribute and/or -mabi=ms to support the wine folks.

X18 is also reserved on Darwin - in my current branch I have it non-dynamic too.
Iain

> 
> Thanks,
> Andrew Pinski
> 
>> 
>> +# define CALL_USED_X18 0
>> 
>> Is that really correct?  If the register is really reserved, but some code 
>> modifies it anyway, this will cause the compiler to restore the old value at 
>> the end of a function; generally, for a reserved register, code that knows 
>> what it's doing would want to make permanent changes to this value.
>> 
>> +#ifdef TARGET_ARM64_MS_ABI
>> +# define STATIC_CHAIN_REGNUM   R17_REGNUM
>> +#else
>> +# define STATIC_CHAIN_REGNUM   R18_REGNUM
>> +#endif
>> 
>> If we went the enum way, we'd want something like
>> 
>> #define STATIC_CHAIN_REGNUM (calling_abi == AARCH64_CALLING_ABI_MS ? 
>> R17_REGNUM : R18_REGNUM)
>> 
>> R.



Re: [PATCH v1 00/13] Add aarch64-w64-mingw32 target

2024-02-22 Thread Mark Harmstone

Hi all,

Seems to work for me! Nice work.

It also works nicely with EFI as well, for anyone interested:

test.c:

#include 

EFI_STATUS efi_main(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE* SystemTable) {
    SystemTable->ConOut->OutputString(SystemTable->ConOut, L"hello, world\r\n");

    return EFI_SUCCESS;
}

$ aarch64-w64-mingw32-gcc -I/usr/include/efi -nostartfiles -Wl,--subsystem,10 -eefi_main test.c -o test.efi

Mark

On 21/2/24 17:47, Evgeny Karpov wrote:

Hello,

We would like to draw your attention to the review of changes for the
new GCC target, aarch64-w64-mingw32. The new target will be
supported, tested, added to CI, and maintained by Linaro. This marks
the first of three planned patch series contributing to the GCC C
compiler's support for Windows Arm64.

1. Minimal aarch64-w64-mingw32 C implementation to cross-compile
hello-world with libgcc for Windows Arm64 using MinGW.
2. Extension of the aarch64-w64-mingw32 C implementation to
cross-compile OpenSSL, OpenBLAS, FFmpeg, and libjpeg-turbo. All
packages successfully pass tests.
3. Addition of call stack support for debugging, resolution of
optimization issues in the C compiler, and DLL export/import for the
aarch64-w64-mingw32 target.

This patch series introduces the 1st point, which involves building
hello-world for the aarch64-w64-mingw32 target. The patch depends on
the binutils changes for the aarch64-w64-mingw32 target that have
already been merged.

The binutils should include recent relocation fixes.
f87eaf8ff3995a5888c6dc4996a20c770e6bcd36
aarch64: Add new relocations and limit COFF AArch64 relocation offsets

The series is structured in a way to trivially show that it should not
affect any other targets.

In this patch, several changes have been made to support the
aarch64-w64-mingw32 target for GCC. The modifications include the
definition of the MS ABI for aarch64, adjustments to FIXED_REGISTERS
and STATIC_CHAIN_REGNUM for different ABIs, and specific definitions
for COFF format on AArch64. Additionally, the patch reuses MinGW
 types and definitions from i386, relocating them to a new
mingw folder for shared usage between both targets.

MinGW-specific options have been introduced for AArch64, along with
override options for aarch64-w64-mingw32. Builtin stack probing for
AArch64 has been enabled as an alternative for chkstk. Symbol name
encoding and section information handling for aarch64-w64-mingw32 have
been incorporated, and the MinGW environment has been added, which
will also be utilized for defining the Cygwin environment in the
future.

The patch includes renaming "x86 Windows Options" to "Cygwin and MinGW
Options," which now encompasses AArch64 as well. AArch64-specific
Cygwin and MinGW Options have been introduced for the unique
requirements of the AArch64 architecture.

Function type declaration and named sections support have been added.
The necessary objects for Cygwin and MinGW have been built for the
aarch64-w64-mingw32 target, and relevant files such as msformat-c.cc
and winnt-d.cc have been moved to the mingw folder for reuse in
AArch64.

Furthermore, the aarch64-w64-mingw32 target has been included in both
libatomic and libgcc, ensuring support for the AArch64 architecture
within these libraries. These changes collectively enhance the
capabilities of GCC for the specified target.

Coauthors: Zac Walker ,
Mark Harmstone   and
Ron Riddle 

Refactored, prepared, and validated by
Radek Barton  and
Evgeny Karpov 

Special thanks to the Linaro GNU toolchain team for internal review
and assistance in preparing the patch series!

Regards,
Evgeny


Zac Walker (13):
   Introduce aarch64-w64-mingw32 target
   aarch64: The aarch64-w64-mingw32 target implements the MS ABI
   aarch64: Mark x18 register as a fixed register for MS ABI
   aarch64: Add aarch64-w64-mingw32 COFF
   Reuse MinGW from i386 for AArch64
   Rename section and encoding functions from i386 which will be used in
 aarch64
   Exclude i386 functionality from aarch64 build
   aarch64: Add Cygwin and MinGW environments for AArch64
   aarch64: Add SEH to machine_function
   Rename "x86 Windows Options" to "Cygwin and MinGW Options"
   aarch64: Build and add objects for Cygwin and MinGW for AArch64
   aarch64: Add aarch64-w64-mingw32 target to libatomic
   Add aarch64-w64-mingw32 target to libgcc

  fixincludes/mkfixinc.sh   |   3 +-
  gcc/config.gcc|  47 +++--
  gcc/config/aarch64/aarch64-coff.h |  92 +
  gcc/config/aarch64/aarch64-opts.h |   7 +
  gcc/config/aarch64/aarch64-protos.h   |   5 +
  gcc/config/aarch64/aarch64.h  |  25 ++-
  gcc/config/aarch64/cygming.h  | 178 ++
  gcc/config/i386/cygming.h |  18 +-
  gcc/config/i386/cygming.opt.urls  |  30 ---
  gcc/config/i386/i386-protos.h |  12 +-
  gcc/config

Re: [PATCH] c: Handle scoped attributes in __has*attribute and scoped attribute parsing changes in -std=c11 etc. modes [PR114007]

2024-02-22 Thread Jakub Jelinek
On Thu, Feb 22, 2024 at 05:49:12PM +, Joseph Myers wrote:
> This patch (the one using COLON_SCOPE, *not* the one using PREV_WHITE) is 
> OK.
> 
> PREV_WHITE is about whether there is whitespace between the tokens in the 
> macro expansion, for the purposes of stringization - I don't think it's 
> appropriate to use here.  For example, given
> 
>   #define COLON() :
> 
> then
> 
>   [[gnu COLON()COLON() unused]] int x;
> 
> should preferably not be valid; stringizing the results of expanding 
> COLON()COLON() will produce "::" (PREV_WHITE not set), but that wouldn't 
> be a valid attribute in C23 and I don't think anyone could reasonably 
> expect it to be valid with previous standard versions.

It is not valid in either version of the patch; with any of -std=c11,
-std=gnu11 or -std=c23, with both compilers I get the same
/tmp/test.c:2:1: warning: ‘gnu’ attribute ignored [-Wattributes]
2 | [[gnu COLON()COLON() unused]] int x;
  | ^
/tmp/test.c:2:6: error: expected ‘]’ before ‘:’ token
2 | [[gnu COLON()COLON() unused]] int x;
  |  ^
  |  ]
message all the time.
But sure, if you prefer the COLON_SCOPE version of the patch, I can commit
that.  There is no PREV_WHITE in the preprocessor, there is
CPP_PADDING token in between the CPP_COLONs though and that translates
into PREV_WHITE on the second CPP_COLON in the FE token.

When trying to verify it, I've noticed one typo in the PREV_WHITE patch:
@@ -73,7 +73,7 @@ gcc/testsuite/
switch (type)
  {
  case CPP_PADDING:
-+  add_flags |= CPP_PADDING;
++  add_flags |= PREV_WHITE;
goto retry;
  
  case CPP_NAME:
but it doesn't change the outcome of those tests, of your
testcase above, or of the C++ -std=c++98 -fpermissive test.  CPP_PADDING
is 85 and PREV_WHITE is 1, so it was setting PREV_WHITE too (along with
3 other flags).

Jakub
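
The arithmetic behind that last remark can be checked with a short sketch. It takes the two values exactly as stated in the message (85 for CPP_PADDING, 1 for PREV_WHITE) and treats them as plain constants; the names `PADDING_VALUE`, `PREV_WHITE_BIT` and `count_set_bits` are made up for illustration and are not libcpp identifiers. Since 85 is 0x55 (64 + 16 + 4 + 1), OR-ing it into a flags word sets bit 0 plus three other bits.

```c
/* Values as quoted in the message above; treated as plain constants.  */
enum { PADDING_VALUE = 85, PREV_WHITE_BIT = 1 };

/* Count the bits set in V.  Hypothetical helper.  */
static int
count_set_bits (unsigned int v)
{
  int n = 0;
  while (v != 0)
    {
      n += (int) (v & 1u);
      v >>= 1;
    }
  return n;
}

/* Bits that the mistaken "add_flags |= CPP_PADDING" would set besides
   PREV_WHITE.  */
static unsigned int
extra_bits (void)
{
  return (unsigned int) PADDING_VALUE & ~(unsigned int) PREV_WHITE_BIT;
}
```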



Re: [PATCH RFA] build: drop target libs from LD_LIBRARY_PATH [PR105688]

2024-02-22 Thread Gaius Mulley
Iain Sandoe  writes:

> Right now, AFAIK the only target runtimes used by host tools are
> libstdc++, libgcc and libgnat.  I agree that might change with rust -
> since the rust folks are talking about using one of the runtimes in
> the FE, I am not aware of other language FEs requiring their target
> runtimes to be available to the host tools (adding Gaius in case I
> missed something with m2 - which is quite complex in the
> bootstrapping).

Hi Iain,

the m2 infrastructure translates and builds gcc/m2/gm2-libs along with
gcc/m2/gm2-compiler and uses these objects for cc1gm2, pge, mc etc -
rather than the library archives generated from /libgm2

regards,
Gaius


Re: [PATCH v1 05/13] Reuse MinGW from i386 for AArch64

2024-02-22 Thread Joseph Myers
On Thu, 22 Feb 2024, Richard Earnshaw (lists) wrote:

> On 21/02/2024 21:34, rep.dot@gmail.com wrote:
> > On 21 February 2024 19:34:43 CET, Evgeny Karpov 
> >  wrote:
> >>
> > 
> > Please use git send-email. Your mail ends up as empty as here, otherwise.
> 
> I don't see anything wrong with it; neither does patchwork 
> (https://patchwork.sourceware.org/project/gcc/list/?series=31191) nor 
> does the Linaro CI bot.  So perhaps it's your mailer that's 
> misconfigured.

The first part of the multipart/mixed message is empty (well, two blank 
lines).  The second part, the patch, has Content-Disposition: attachment, 
so it's correct for a mailer not to show it inline (instead showing an 
empty message with an attachment - which is how it appeared to me).

-- 
Joseph S. Myers
josmy...@redhat.com



Re: [PATCH] c: Handle scoped attributes in __has*attribute and scoped attribute parsing changes in -std=c11 etc. modes [PR114007]

2024-02-22 Thread Joseph Myers
On Thu, 22 Feb 2024, Jakub Jelinek wrote:

> Hi!
> 
> We aren't able to parse __has_attribute (vendor::attr) (and __has_c_attribute
> and __has_cpp_attribute) in strict C < C23 modes.  While in -std=gnu* modes
> or in -std=c23 there is CPP_SCOPE token, in -std=c* (except for -std=c23)
> there is just a pair of CPP_COLON tokens.
> The c-lex.cc hunk adds support for that.
> 
> That leads to a question if we should return 1 or 0 from
> __has_attribute (gnu::unused) or not, because while
> [[gnu::unused]] is parsed fine in -std=gnu*/-std=c23 modes (sure, with
> pedwarn for < C23), we do not parse it at all in -std=c* (except for
> -std=c23), we only parse [[__extension__ gnu::unused]] there.  While
> the __extension__ in there helps to avoid the pedwarn, I think it is
> better to be consistent between GNU and strict C < C23 modes and
> parse [[gnu::unused]] too; on the other side, I think parsing
> [[__extension__ gnu : : unused]] is too weird and undesirable.
> 
> So, the following patch adds a flag during preprocessing at the point
> where we normally create CPP_SCOPE tokens out of 2 consecutive colons
> on the first CPP_COLON to mark the consecutive case (as we are tight
> on the bits, I've reused the PURE_ZERO flag, which is used just by the
> C++ FE and only ever set (both C and C++) on CPP_NUMBER tokens, this
> new flag has the same value and is only ever used on CPP_COLON tokens)
> and instead of checking loose_scope_p argument (i.e. whether it is
> [[__extension__ ...]] or not), it just parses CPP_SCOPE or CPP_COLON
> with COLON_SCOPE flag followed by another CPP_COLON the same.
> The latter will never appear in >= C23 or -std=gnu* modes, though
> guarding its use say with flag_iso && !flag_isoc23 && doesn't really
> work because the __extension__ case temporarily clears flag_iso flag.

This patch (the one using COLON_SCOPE, *not* the one using PREV_WHITE) is 
OK.

PREV_WHITE is about whether there is whitespace between the tokens in the 
macro expansion, for the purposes of stringization - I don't think it's 
appropriate to use here.  For example, given

  #define COLON() :

then

  [[gnu COLON()COLON() unused]] int x;

should preferably not be valid; stringizing the results of expanding 
COLON()COLON() will produce "::" (PREV_WHITE not set), but that wouldn't 
be a valid attribute in C23 and I don't think anyone could reasonably 
expect it to be valid with previous standard versions.

-- 
Joseph S. Myers
josmy...@redhat.com
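
The stringization behaviour described in the message above can be reproduced with a small sketch. The `COLON` macro is the one from the message; `STR`, `XSTR` and `expanded_colons` are hypothetical helpers added here for illustration. As the message states, no whitespace separates the two expanded colons, so the stringized result is expected to be "::".

```c
#define COLON() :

/* Two-level stringization so the argument is macro-expanded first.  */
#define STR(x) #x
#define XSTR(x) STR (x)

/* Returns the stringized expansion of COLON()COLON().  */
static const char *
expanded_colons (void)
{
  return XSTR (COLON()COLON());
}
```

This only demonstrates the preprocessor side; whether the front end should accept the two colons as a scope operator is the separate question the thread settles with the COLON_SCOPE flag.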



Re: [PATCH v1 03/13] aarch64: Mark x18 register as a fixed register for MS ABI

2024-02-22 Thread Andrew Pinski
On Thu, Feb 22, 2024 at 3:56 AM Richard Earnshaw (lists)
 wrote:
>
> On 21/02/2024 18:30, Evgeny Karpov wrote:
> >
> +/* X18 reserved for the TEB on Windows.  */
> +#ifdef TARGET_ARM64_MS_ABI
> +# define FIXED_X18 1
> +# define CALL_USED_X18 0
> +#else
> +# define FIXED_X18 0
> +# define CALL_USED_X18 1
> +#endif
>
> I'm not overly keen on ifdefs like this (and the one below), it can get quite 
> confusing if we have to support more than a couple of ABIs.  Perhaps we could 
> create a couple of new headers, one for the EABI (which all existing targets 
> would then need to include) and one for the MS ABI.  Then the mingw port 
> would use that instead of the EABI header.
>
> An alternative is to make all this dynamic, based on the setting of the 
> aarch64_calling_abi enum and to make the adjustments in 
> aarch64_conditional_register_usage.

Dynamically might be needed also if we want to support ms_abi
attribute and/or -mabi=ms to support the wine folks.

Thanks,
Andrew Pinski

>
> +# define CALL_USED_X18 0
>
> Is that really correct?  If the register is really reserved, but some code 
> modifies it anyway, this will cause the compiler to restore the old value at 
> the end of a function; generally, for a reserved register, code that knows 
> what it's doing would want to make permanent changes to this value.
>
> +#ifdef TARGET_ARM64_MS_ABI
> +# define STATIC_CHAIN_REGNUM   R17_REGNUM
> +#else
> +# define STATIC_CHAIN_REGNUM   R18_REGNUM
> +#endif
>
> If we went the enum way, we'd want something like
>
> #define STATIC_CHAIN_REGNUM (calling_abi == AARCH64_CALLING_ABI_MS ? 
> R17_REGNUM : R18_REGNUM)
>
> R.
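
A minimal sketch of the "dynamic" alternative discussed above, assuming the ABI is consulted at run time instead of through an #ifdef. All names here (`calling_abi`, `reg_usage`, `conditional_register_usage`, `static_chain_regnum`) are hypothetical stand-ins, not GCC's actual definitions. The register settings mirror the quoted patch, including CALL_USED_X18 0 for the MS ABI, which the review questions.

```c
/* Hypothetical stand-ins for illustration; not GCC's real types.  */
enum calling_abi { CALLING_ABI_DEFAULT, CALLING_ABI_MS };

enum { R17_REGNUM = 17, R18_REGNUM = 18, NUM_REGS = 32 };

struct reg_usage
{
  char fixed[NUM_REGS];
  char call_used[NUM_REGS];
};

/* Pick the static chain register from the ABI at run time.  */
static int
static_chain_regnum (enum calling_abi abi)
{
  return abi == CALLING_ABI_MS ? R17_REGNUM : R18_REGNUM;
}

/* Adjust the x18 entries according to the ABI, using the same values
   as the FIXED_X18 / CALL_USED_X18 macros in the quoted patch.  */
static void
conditional_register_usage (struct reg_usage *u, enum calling_abi abi)
{
  int ms = (abi == CALLING_ABI_MS);
  u->fixed[R18_REGNUM] = ms ? 1 : 0;
  u->call_used[R18_REGNUM] = ms ? 0 : 1;
}
```

Keeping the choice in functions rather than macros is what would let a per-function ms_abi attribute or an -mabi=ms style option select the ABI, as suggested in the reply.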


Re: [PATCH] libgccjit: Allow comparing aligned int types

2024-02-22 Thread Antoni Boucher
Thanks for the review.
Here's the updated patch.

On Wed, 2024-01-24 at 12:18 -0500, David Malcolm wrote:
> On Thu, 2023-12-21 at 08:33 -0500, Antoni Boucher wrote:
> > Hi.
> > This patch allows comparing aligned integer types as equal.
> > There's a TODO in the code about whether we should check that the
> > alignment is equal.
> > What are your thoughts on this?
> 
> I think we should check for equal alignment.
> 
> [...snip...]
> 
> > diff --git a/gcc/testsuite/jit.dg/test-types.c
> > b/gcc/testsuite/jit.dg/test-types.c
> > index a01944e35fa..c2f4d2bcb3d 100644
> > --- a/gcc/testsuite/jit.dg/test-types.c
> > +++ b/gcc/testsuite/jit.dg/test-types.c
> > @@ -485,11 +485,15 @@ verify_code (gcc_jit_context *ctxt,
> > gcc_jit_result *result)
> >  
> >    CHECK_VALUE (z.m_FILE_ptr, stderr);
> >  
> > +  gcc_jit_type *long_type = gcc_jit_context_get_type (ctxt,
> > GCC_JIT_TYPE_LONG);
> > +  gcc_jit_type *int64_type = gcc_jit_context_get_type (ctxt,
> > GCC_JIT_TYPE_INT64_T);
> >    if (sizeof(long) == 8)
> > -    CHECK (gcc_jit_compatible_types (
> > -  gcc_jit_context_get_type (ctxt, GCC_JIT_TYPE_LONG),
> > -  gcc_jit_context_get_type (ctxt, GCC_JIT_TYPE_INT64_T)));
> > +    CHECK (gcc_jit_compatible_types (long_type, int64_type));
> >  
> >    CHECK_VALUE (gcc_jit_type_get_size (gcc_jit_context_get_type
> > (ctxt, GCC_JIT_TYPE_FLOAT)), sizeof (float));
> >    CHECK_VALUE (gcc_jit_type_get_size (gcc_jit_context_get_type
> > (ctxt, GCC_JIT_TYPE_DOUBLE)), sizeof (double));
> > +
> > +  gcc_jit_type *aligned_long = gcc_jit_type_get_aligned
> > (long_type, 4);
> > +  gcc_jit_type *aligned_int64 = gcc_jit_type_get_aligned
> > (int64_type, 4);
> > +  CHECK (gcc_jit_compatible_types (aligned_long, aligned_int64));
> 
> This CHECK should be guarded on sizeof(long) == 8 like the check
> above.
> 
> 
> Dave
> 
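
The rule requested in the review (integer types compare equal only when size, signedness and alignment all match) can be sketched as a standalone model. This is not the libgccjit implementation; `int_type_desc` and `same_int_type` are hypothetical names used only to illustrate the comparison.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of an integer type.  */
struct int_type_desc
{
  size_t size;       /* size in bytes */
  bool is_signed;    /* signedness */
  size_t alignment;  /* alignment in bytes */
};

/* Two integer types are treated as the same type only if size,
   signedness and alignment all agree.  */
static bool
same_int_type (const struct int_type_desc *a, const struct int_type_desc *b)
{
  return a->size == b->size
	 && a->is_signed == b->is_signed
	 && a->alignment == b->alignment;
}
```

Under this model, an 8-byte signed type aligned to 4 matches another 8-byte signed type aligned to 4, but not one aligned to 8.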

From 899a0555fa1b1796d29b59a6a41db854c46f6c09 Mon Sep 17 00:00:00 2001
From: Antoni Boucher 
Date: Sun, 8 Oct 2023 09:12:12 -0400
Subject: [PATCH] libgccjit: Allow comparing aligned int types

gcc/jit/ChangeLog:

	* jit-common.h: Add forward declaration of memento_of_get_aligned.
	* jit-recording.h (type::is_same_type_as): Compare integer
	types.
	(dyn_cast_aligned_type): New method.
	(type::is_aligned, memento_of_get_aligned::is_same_type_as,
	memento_of_get_aligned::is_aligned): new methods.

gcc/testsuite/ChangeLog:

	* jit.dg/test-types.c: Add checks comparing aligned types.
---
 gcc/jit/jit-common.h  |  1 +
 gcc/jit/jit-recording.h   | 30 +++---
 gcc/testsuite/jit.dg/test-types.c | 11 ---
 3 files changed, 32 insertions(+), 10 deletions(-)

diff --git a/gcc/jit/jit-common.h b/gcc/jit/jit-common.h
index 1e335878b56..53a6dcce79f 100644
--- a/gcc/jit/jit-common.h
+++ b/gcc/jit/jit-common.h
@@ -133,6 +133,7 @@ namespace recording {
 class statement;
   class extended_asm;
 class case_;
+class memento_of_get_aligned;
   class top_level_asm;
 
   /* End of recording types. */
diff --git a/gcc/jit/jit-recording.h b/gcc/jit/jit-recording.h
index d8d16f4fe29..9af952cc217 100644
--- a/gcc/jit/jit-recording.h
+++ b/gcc/jit/jit-recording.h
@@ -552,6 +552,7 @@ public:
   virtual function_type *as_a_function_type() { gcc_unreachable (); return NULL; }
   virtual struct_ *dyn_cast_struct () { return NULL; }
   virtual vector_type *dyn_cast_vector_type () { return NULL; }
+  virtual memento_of_get_aligned *dyn_cast_aligned_type () { return NULL; }
 
   /* Is it typesafe to copy to this type from rtype?  */
   virtual bool accepts_writes_from (type *rtype)
@@ -562,6 +563,14 @@ public:
 
   virtual bool is_same_type_as (type *other)
   {
+if (is_int ()
+		 && other->is_int ()
+		 && get_size () == other->get_size ()
+		 && is_signed () == other->is_signed ())
+{
+  /* LHS (this) is an integer of the same size and sign as rtype.  */
+  return true;
+}
 return this == other;
   }
 
@@ -579,6 +588,7 @@ public:
   virtual type *is_volatile () { return NULL; }
   virtual type *is_restrict () { return NULL; }
   virtual type *is_const () { return NULL; }
+  virtual type *is_aligned () { return NULL; }
   virtual type *is_array () = 0;
   virtual struct_ *is_struct () { return NULL; }
   virtual bool is_union () const { return false; }
@@ -633,13 +643,6 @@ public:
 	   accept it:  */
 	return true;
 	  }
-  } else if (is_int ()
-		 && rtype->is_int ()
-		 && get_size () == rtype->get_size ()
-		 && is_signed () == rtype->is_signed ())
-  {
-	/* LHS (this) is an integer of the same size and sign as rtype.  */
-	return true;
   }
 
 return type::accepts_writes_from (rtype);
@@ -816,10 +819,23 @@ public:
   : decorated_type (other_type),
 m_alignment_in_bytes (alignment_in_bytes) {}
 
+  bool is_same_type_as (type *other) final override
+  {
+if (!other->is_aligned ())
+{
+  return m_other_type->is_same_type_as (other);
+}
+return m_alignment_in_bytes == other->dyn_cast_aligned_type ()->m_

[PATCH v5 5/5] arm: Add support for MVE Tail-Predicated Low Overhead Loops

2024-02-22 Thread Andre Vieira

This patch adds support for MVE Tail-Predicated Low Overhead Loops by using the
doloop functionality added to support predicated vectorized hardware loops.

gcc/ChangeLog:

* config/arm/arm-protos.h (arm_target_bb_ok_for_lob): Change
declaration to pass basic_block.
(arm_attempt_dlstp_transform): New declaration.
* config/arm/arm.cc (TARGET_LOOP_UNROLL_ADJUST): Define target hook.
(TARGET_PREDICT_DOLOOP_P): Likewise.
(arm_target_bb_ok_for_lob): Adapt condition.
(arm_mve_get_vctp_lanes): New function.
(arm_dl_usage_type): New internal enum.
(arm_get_required_vpr_reg): New function.
(arm_get_required_vpr_reg_param): New function.
(arm_get_required_vpr_reg_ret_val): New function.
(arm_mve_get_loop_vctp): New function.
(arm_mve_insn_predicated_by): New function.
(arm_mve_across_lane_insn_p): New function.
(arm_mve_load_store_insn_p): New function.
(arm_mve_impl_pred_on_outputs_p): New function.
(arm_mve_impl_pred_on_inputs_p): New function.
(arm_last_vect_def_insn): New function.
(arm_mve_impl_predicated_p): New function.
(arm_mve_check_reg_origin_is_num_elems): New function.
(arm_mve_dlstp_check_inc_counter): New function.
(arm_mve_dlstp_check_dec_counter): New function.
(arm_mve_loop_valid_for_dlstp): New function.
(arm_predict_doloop_p): New function.
(arm_loop_unroll_adjust): New function.
(arm_emit_mve_unpredicated_insn_to_seq): New function.
(arm_attempt_dlstp_transform): New function.
* config/arm/arm.opt (mdlstp): New option.
* config/arm/iterators.md (dlstp_elemsize, letp_num_lanes,
letp_num_lanes_neg, letp_num_lanes_minus_1): New attributes.
(DLSTP, LETP): New iterators.
(predicated_doloop_end_internal): New pattern.
(dlstp_insn): New pattern.
* config/arm/thumb2.md (doloop_end): Adapt to support tail-predicated
loops.
(doloop_begin): Likewise.
* config/arm/types.md (mve_misc): New mve type to represent
predicated_loop_end insn sequences.
* config/arm/unspecs.md:
(DLSTP8, DLSTP16, DLSTP32, DLSTP64,
LETP8, LETP16, LETP32, LETP64): New unspecs for DLSTP and LETP.

gcc/testsuite/ChangeLog:

* gcc.target/arm/lob.h: Add new helpers.
* gcc.target/arm/lob1.c: Use new helpers.
* gcc.target/arm/lob6.c: Likewise.
* gcc.target/arm/dlstp-compile-asm-1.c: New test.
* gcc.target/arm/dlstp-compile-asm-2.c: New test.
* gcc.target/arm/dlstp-compile-asm-3.c: New test.
* gcc.target/arm/dlstp-int8x16.c: New test.
* gcc.target/arm/dlstp-int8x16-run.c: New test.
* gcc.target/arm/dlstp-int16x8.c: New test.
* gcc.target/arm/dlstp-int16x8-run.c: New test.
* gcc.target/arm/dlstp-int32x4.c: New test.
* gcc.target/arm/dlstp-int32x4-run.c: New test.
* gcc.target/arm/dlstp-int64x2.c: New test.
* gcc.target/arm/dlstp-int64x2-run.c: New test.
* gcc.target/arm/dlstp-invalid-asm.c: New test.

Co-authored-by: Stam Markianos-Wright 
---
 gcc/config/arm/arm-protos.h   |4 +-
 gcc/config/arm/arm.cc | 1249 -
 gcc/config/arm/arm.opt|3 +
 gcc/config/arm/iterators.md   |   15 +
 gcc/config/arm/mve.md |   50 +
 gcc/config/arm/thumb2.md  |  138 +-
 gcc/config/arm/types.md   |6 +-
 gcc/config/arm/unspecs.md |   14 +-
 gcc/testsuite/gcc.target/arm/lob.h|  128 +-
 gcc/testsuite/gcc.target/arm/lob1.c   |   23 +-
 gcc/testsuite/gcc.target/arm/lob6.c   |8 +-
 .../gcc.target/arm/mve/dlstp-compile-asm-1.c  |  146 ++
 .../gcc.target/arm/mve/dlstp-compile-asm-2.c  |  749 ++
 .../gcc.target/arm/mve/dlstp-compile-asm-3.c  |   46 +
 .../gcc.target/arm/mve/dlstp-int16x8-run.c|   44 +
 .../gcc.target/arm/mve/dlstp-int16x8.c|   31 +
 .../gcc.target/arm/mve/dlstp-int32x4-run.c|   45 +
 .../gcc.target/arm/mve/dlstp-int32x4.c|   31 +
 .../gcc.target/arm/mve/dlstp-int64x2-run.c|   48 +
 .../gcc.target/arm/mve/dlstp-int64x2.c|   28 +
 .../gcc.target/arm/mve/dlstp-int8x16-run.c|   44 +
 .../gcc.target/arm/mve/dlstp-int8x16.c|   32 +
 .../gcc.target/arm/mve/dlstp-invalid-asm.c|  521 +++
 23 files changed, 3321 insertions(+), 82 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-compile-asm-1.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-compile-asm-2.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-compile-asm-3.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-int16x8-run.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-int16x8.c
 create mode 100644 gcc/testsuite/gcc.target

[PATCH v5 1/5] arm: Add define_attr to create a mapping between MVE predicated and unpredicated insns

2024-02-22 Thread Andre Vieira

This patch adds an attribute to the mve md patterns to be able to identify
predicable MVE instructions and what their predicated and unpredicated variants
are.  This attribute is used to encode the icode of the unpredicated variant of
an instruction in its predicated variant.

This will make it possible for us to transform VPT-predicated insns in
the insn chain into their unpredicated equivalents when transforming the loop
into a MVE Tail-Predicated Low Overhead Loop. For example:
`mve_vldrbq_z_ -> mve_vldrbq_`.

gcc/ChangeLog:

* config/arm/arm.md (mve_unpredicated_insn): New attribute.
* config/arm/arm.h (MVE_VPT_PREDICATED_INSN_P): New define.
(MVE_VPT_UNPREDICATED_INSN_P): Likewise.
(MVE_VPT_PREDICABLE_INSN_P): Likewise.
* config/arm/vec-common.md (mve_vshlq_): Add attribute.
* config/arm/mve.md (arm_vcx1q_p_v16qi): Add attribute.
(arm_vcx1qv16qi): Likewise.
(arm_vcx1qav16qi): Likewise.
(arm_vcx1qv16qi): Likewise.
(arm_vcx2q_p_v16qi): Likewise.
(arm_vcx2qv16qi): Likewise.
(arm_vcx2qav16qi): Likewise.
(arm_vcx2qv16qi): Likewise.
(arm_vcx3q_p_v16qi): Likewise.
(arm_vcx3qv16qi): Likewise.
(arm_vcx3qav16qi): Likewise.
(arm_vcx3qv16qi): Likewise.
(@mve_q_): Likewise.
(@mve_q_int_): Likewise.
(@mve_q_v4si): Likewise.
(@mve_q_n_): Likewise.
(@mve_q_r_): Likewise.
(@mve_q_f): Likewise.
(@mve_q_m_): Likewise.
(@mve_q_m_n_): Likewise.
(@mve_q_m_r_): Likewise.
(@mve_q_m_f): Likewise.
(@mve_q_int_m_): Likewise.
(@mve_q_p_v4si): Likewise.
(@mve_q_p_): Likewise.
(@mve_q_): Likewise.
(@mve_q_f): Likewise.
(@mve_q_m_): Likewise.
(@mve_q_m_f): Likewise.
(mve_vq_f): Likewise.
(mve_q): Likewise.
(mve_q_f): Likewise.
(mve_vadciq_v4si): Likewise.
(mve_vadciq_m_v4si): Likewise.
(mve_vadcq_v4si): Likewise.
(mve_vadcq_m_v4si): Likewise.
(mve_vandq_): Likewise.
(mve_vandq_f): Likewise.
(mve_vandq_m_): Likewise.
(mve_vandq_m_f): Likewise.
(mve_vandq_s): Likewise.
(mve_vandq_u): Likewise.
(mve_vbicq_): Likewise.
(mve_vbicq_f): Likewise.
(mve_vbicq_m_): Likewise.
(mve_vbicq_m_f): Likewise.
(mve_vbicq_m_n_): Likewise.
(mve_vbicq_n_): Likewise.
(mve_vbicq_s): Likewise.
(mve_vbicq_u): Likewise.
(@mve_vclzq_s): Likewise.
(mve_vclzq_u): Likewise.
(@mve_vcmp_q_): Likewise.
(@mve_vcmp_q_n_): Likewise.
(@mve_vcmp_q_f): Likewise.
(@mve_vcmp_q_n_f): Likewise.
(@mve_vcmp_q_m_f): Likewise.
(@mve_vcmp_q_m_n_): Likewise.
(@mve_vcmp_q_m_): Likewise.
(@mve_vcmp_q_m_n_f): Likewise.
(mve_vctpq): Likewise.
(mve_vctpq_m): Likewise.
(mve_vcvtaq_): Likewise.
(mve_vcvtaq_m_): Likewise.
(mve_vcvtbq_f16_f32v8hf): Likewise.
(mve_vcvtbq_f32_f16v4sf): Likewise.
(mve_vcvtbq_m_f16_f32v8hf): Likewise.
(mve_vcvtbq_m_f32_f16v4sf): Likewise.
(mve_vcvtmq_): Likewise.
(mve_vcvtmq_m_): Likewise.
(mve_vcvtnq_): Likewise.
(mve_vcvtnq_m_): Likewise.
(mve_vcvtpq_): Likewise.
(mve_vcvtpq_m_): Likewise.
(mve_vcvtq_from_f_): Likewise.
(mve_vcvtq_m_from_f_): Likewise.
(mve_vcvtq_m_n_from_f_): Likewise.
(mve_vcvtq_m_n_to_f_): Likewise.
(mve_vcvtq_m_to_f_): Likewise.
(mve_vcvtq_n_from_f_): Likewise.
(mve_vcvtq_n_to_f_): Likewise.
(mve_vcvtq_to_f_): Likewise.
(mve_vcvttq_f16_f32v8hf): Likewise.
(mve_vcvttq_f32_f16v4sf): Likewise.
(mve_vcvttq_m_f16_f32v8hf): Likewise.
(mve_vcvttq_m_f32_f16v4sf): Likewise.
(mve_vdwdupq_m_wb_u_insn): Likewise.
(mve_vdwdupq_wb_u_insn): Likewise.
(mve_veorq_s>): Likewise.
(mve_veorq_u>): Likewise.
(mve_veorq_f): Likewise.
(mve_vidupq_m_wb_u_insn): Likewise.
(mve_vidupq_u_insn): Likewise.
(mve_viwdupq_m_wb_u_insn): Likewise.
(mve_viwdupq_wb_u_insn): Likewise.
(mve_vldrbq_): Likewise.
(mve_vldrbq_gather_offset_): Likewise.
(mve_vldrbq_gather_offset_z_): Likewise.
(mve_vldrbq_z_): Likewise.
(mve_vldrdq_gather_base_v2di): Likewise.
(mve_vldrdq_gather_base_wb_v2di_insn): Likewise.
(mve_vldrdq_gather_base_wb_z_v2di_insn): Likewise.
(mve_vldrdq_gather_base_z_v2di): Likewise.
(mve_vldrdq_gather_offset_v2di): Likewise.
(mve_vldrdq_gather_offset_z_v2di): Likewise.
(mve_vldrdq_gather_shifted_offset_v2di): Likewise.
(mve_vldrdq_gather_shifted_offset_z_v2di): Likewise.
(mve_vldrhq_): Likewise.
(mve_vldrhq_fv8hf): Likewise.
(mve_vldrhq_gather_offset_): Likewi

[PATCH v5 4/5] arm: Fix a wrong attribute use and remove unused unspecs and iterators

2024-02-22 Thread Andre Vieira

This patch fixes the erroneous use of a mode attribute without a mode iterator
in the pattern and removes unused unspecs and iterators.

gcc/ChangeLog:

* config/arm/iterators.md (supf): Remove VMLALDAVXQ_U, VMLALDAVXQ_P_U,
VMLALDAVAXQ_U cases.
(VMLALDAVXQ): Remove iterator.
(VMLALDAVXQ_P): Likewise.
(VMLALDAVAXQ): Likewise.
* config/arm/mve.md (mve_vstrwq_p_fv4sf): Replace use of 
mode iterator attribute with V4BI mode.
* config/arm/unspecs.md (VMLALDAVXQ_U, VMLALDAVXQ_P_U,
VMLALDAVAXQ_U): Remove unused unspecs.
---
 gcc/config/arm/iterators.md | 9 +++--
 gcc/config/arm/mve.md   | 2 +-
 gcc/config/arm/unspecs.md   | 3 ---
 3 files changed, 4 insertions(+), 10 deletions(-)

diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index 22b3ddf5637..3206bcab4cf 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -2370,7 +2370,7 @@ (define_int_attr supf [(VCVTQ_TO_F_S "s") (VCVTQ_TO_F_U "u") (VREV16Q_S "s")
 		   (VSUBQ_S "s") (VSUBQ_U "u") (VADDVAQ_S "s")
 		   (VADDVAQ_U "u") (VADDLVAQ_S "s") (VADDLVAQ_U "u")
 		   (VBICQ_N_S "s") (VBICQ_N_U "u") (VMLALDAVQ_U "u")
-		   (VMLALDAVQ_S "s") (VMLALDAVXQ_U "u") (VMLALDAVXQ_S "s")
+		   (VMLALDAVQ_S "s") (VMLALDAVXQ_S "s")
 		   (VMOVNBQ_U "u") (VMOVNBQ_S "s") (VMOVNTQ_U "u")
 		   (VMOVNTQ_S "s") (VORRQ_N_S "s") (VORRQ_N_U "u")
 		   (VQMOVNBQ_U "u") (VQMOVNBQ_S "s") (VQMOVNTQ_S "s")
@@ -2412,8 +2412,8 @@ (define_int_attr supf [(VCVTQ_TO_F_S "s") (VCVTQ_TO_F_U "u") (VREV16Q_S "s")
 		   (VREV16Q_M_S "s") (VREV16Q_M_U "u")
 		   (VQRSHRNTQ_N_U "u") (VMOVNTQ_M_U "u") (VMOVLBQ_M_U "u")
 		   (VMLALDAVAQ_U "u") (VQSHRNBQ_N_U "u") (VSHRNBQ_N_U "u")
-		   (VRSHRNBQ_N_U "u") (VMLALDAVXQ_P_U "u")
-		   (VMVNQ_M_N_U "u") (VQSHRNTQ_N_U "u") (VMLALDAVAXQ_U "u")
+		   (VRSHRNBQ_N_U "u")
+		   (VMVNQ_M_N_U "u") (VQSHRNTQ_N_U "u")
 		   (VQMOVNTQ_M_U "u") (VSHRNTQ_N_U "u") (VCVTMQ_M_S "s")
 		   (VCVTMQ_M_U "u") (VCVTNQ_M_S "s") (VCVTNQ_M_U "u")
 		   (VCVTPQ_M_S "s") (VCVTPQ_M_U "u") (VADDLVAQ_P_S "s")
@@ -2762,7 +2762,6 @@ (define_int_iterator VSUBQ_N [VSUBQ_N_S VSUBQ_N_U])
 (define_int_iterator VADDLVAQ [VADDLVAQ_S VADDLVAQ_U])
 (define_int_iterator VBICQ_N [VBICQ_N_S VBICQ_N_U])
 (define_int_iterator VMLALDAVQ [VMLALDAVQ_U VMLALDAVQ_S])
-(define_int_iterator VMLALDAVXQ [VMLALDAVXQ_U VMLALDAVXQ_S])
 (define_int_iterator VMOVNBQ [VMOVNBQ_U VMOVNBQ_S])
 (define_int_iterator VMOVNTQ [VMOVNTQ_S VMOVNTQ_U])
 (define_int_iterator VORRQ_N [VORRQ_N_U VORRQ_N_S])
@@ -2817,11 +2816,9 @@ (define_int_iterator VMLALDAVAQ [VMLALDAVAQ_S VMLALDAVAQ_U])
 (define_int_iterator VQSHRNBQ_N [VQSHRNBQ_N_U VQSHRNBQ_N_S])
 (define_int_iterator VSHRNBQ_N [VSHRNBQ_N_U VSHRNBQ_N_S])
 (define_int_iterator VRSHRNBQ_N [VRSHRNBQ_N_S VRSHRNBQ_N_U])
-(define_int_iterator VMLALDAVXQ_P [VMLALDAVXQ_P_U VMLALDAVXQ_P_S])
 (define_int_iterator VQMOVNTQ_M [VQMOVNTQ_M_U VQMOVNTQ_M_S])
 (define_int_iterator VMVNQ_M_N [VMVNQ_M_N_U VMVNQ_M_N_S])
 (define_int_iterator VQSHRNTQ_N [VQSHRNTQ_N_U VQSHRNTQ_N_S])
-(define_int_iterator VMLALDAVAXQ [VMLALDAVAXQ_S VMLALDAVAXQ_U])
 (define_int_iterator VSHRNTQ_N [VSHRNTQ_N_S VSHRNTQ_N_U])
 (define_int_iterator VCVTMQ_M [VCVTMQ_M_S VCVTMQ_M_U])
 (define_int_iterator VCVTNQ_M [VCVTNQ_M_S VCVTNQ_M_U])
diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index d7bdcd862f8..9fe51298cdc 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -4605,7 +4605,7 @@ (define_insn "mve_vstrwq_p_fv4sf"
   [(set (match_operand:V4SI 0 "mve_memory_operand" "=Ux")
 	(unspec:V4SI
 	 [(match_operand:V4SF 1 "s_register_operand" "w")
-	  (match_operand: 2 "vpr_register_operand" "Up")
+	  (match_operand:V4BI 2 "vpr_register_operand" "Up")
 	  (match_dup 0)]
 	 VSTRWQ_F))]
   "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT"
diff --git a/gcc/config/arm/unspecs.md b/gcc/config/arm/unspecs.md
index b9db306c067..46ac8b37157 100644
--- a/gcc/config/arm/unspecs.md
+++ b/gcc/config/arm/unspecs.md
@@ -717,7 +717,6 @@ (define_c_enum "unspec" [
   VCVTBQ_F16_F32
   VCVTTQ_F16_F32
   VMLALDAVQ_U
-  VMLALDAVXQ_U
   VMLALDAVXQ_S
   VMLALDAVQ_S
   VMLSLDAVQ_S
@@ -934,7 +933,6 @@ (define_c_enum "unspec" [
   VSHRNBQ_N_S
   VRSHRNBQ_N_S
   VRSHRNBQ_N_U
-  VMLALDAVXQ_P_U
   VMLALDAVXQ_P_S
   VQMOVNTQ_M_U
   VQMOVNTQ_M_S
@@ -943,7 +941,6 @@ (define_c_enum "unspec" [
   VQSHRNTQ_N_U
   VQSHRNTQ_N_S
   VMLALDAVAXQ_S
-  VMLALDAVAXQ_U
   VSHRNTQ_N_S
   VSHRNTQ_N_U
   VCVTBQ_M_F16_F32


[PATCH v5 3/5] arm: Annotate instructions with mve_safe_imp_xlane_pred

2024-02-22 Thread Andre Vieira

This patch annotates some MVE across-lane instructions with a new attribute.
We use this attribute to let the compiler know that these instructions can
safely be implicitly predicated when tail predicating, provided their operands
are guaranteed to have zeroed tail-predicated lanes.  These instructions were
selected because having the value 0 in those lanes and 'tail-predicating' those
lanes have the same effect.
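
To illustrate why a zeroed lane and a tail-predicated lane are interchangeable
for some reductions but not for others, here is a minimal C++ sketch.  It models
an 8-lane vector as a plain array; `zero_tail`, `add_across` and `min_across`
are hypothetical stand-ins written for this illustration, not the MVE
intrinsics or anything from the patch.

```cpp
#include <array>
#include <cstdint>

constexpr int kLanes = 8;                 // models 8 lanes of 16-bit elements
using Vec = std::array<int16_t, kLanes>;

// Models a zeroing predicated load: lanes at index >= active become 0.
Vec zero_tail(Vec v, int active) {
  for (int i = active; i < kLanes; ++i) v[i] = 0;
  return v;
}

// Add across the first `active` lanes only.  With the default argument this
// is the unpredicated form that reads every lane.
int32_t add_across(const Vec &v, int active = kLanes) {
  int32_t sum = 0;
  for (int i = 0; i < active; ++i) sum += v[i];
  return sum;
}

// Signed minimum across the first `active` lanes, seeded with `init`.
int16_t min_across(const Vec &v, int16_t init, int active = kLanes) {
  int16_t m = init;
  for (int i = 0; i < active; ++i)
    if (v[i] < m) m = v[i];
  return m;
}
```

In this model, an unpredicated add over a zero-tailed vector gives the same
result as a predicated add, because 0 is the identity for addition.  A signed
minimum does not have that property: a zeroed lane can win the comparison,
which is in line with the "no" entries in the iterator attribute below.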

gcc/ChangeLog:

* config/arm/arm.md (mve_safe_imp_xlane_pred): New attribute.
* config/arm/iterators.md (mve_vmaxmin_safe_imp): New iterator
attribute.
* config/arm/mve.md (vaddvq_s, vaddvq_u, vaddlvq_s, vaddlvq_u,
vaddvaq_s, vaddvaq_u, vmaxavq_s, vmaxvq_u, vmladavq_s, vmladavq_u,
vmladavxq_s, vmlsdavq_s, vmlsdavxq_s, vaddlvaq_s, vaddlvaq_u,
vmlaldavq_u, vmlaldavq_s, vmlaldavq_u, vmlaldavxq_s, vmlsldavq_s,
vmlsldavxq_s, vrmlaldavhq_u, vrmlaldavhq_s, vrmlaldavhxq_s,
vrmlsldavhq_s, vrmlsldavhxq_s, vrmlaldavhaq_s, vrmlaldavhaq_u,
vrmlaldavhaxq_s, vrmlsldavhaq_s, vrmlsldavhaxq_s, vabavq_s, vabavq_u,
vmladavaq_u, vmladavaq_s, vmladavaxq_s, vmlsdavaq_s, vmlsdavaxq_s,
vmlaldavaq_s, vmlaldavaq_u, vmlaldavaxq_s, vmlsldavaq_s,
vmlsldavaxq_s): Added mve_safe_imp_xlane_pred.
---
 gcc/config/arm/arm.md   |  6 ++
 gcc/config/arm/iterators.md |  8 
 gcc/config/arm/mve.md   | 12 
 3 files changed, 26 insertions(+)

diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 81290e83818..814e871acea 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -130,6 +130,12 @@ (define_attr "predicated" "yes,no" (const_string "no"))
 ; encode that it is a predicable instruction.
 (define_attr "mve_unpredicated_insn" "" (symbol_ref "CODE_FOR_nothing"))
 
+; An attribute used by the loop-doloop pass when determining whether it is
+; safe to predicate a MVE instruction, that operates across lanes, and was
+; previously not predicated.  The pass will still check whether all inputs
+; are predicated by the VCTP predication mask.
+(define_attr "mve_safe_imp_xlane_pred" "yes,no" (const_string "no"))
+
 ; LENGTH of an instruction (in bytes)
 (define_attr "length" ""
   (const_int 4))
diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index 7600bf62531..22b3ddf5637 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -869,6 +869,14 @@ (define_code_attr mve_addsubmul [
 		 (plus "vadd")
 		 ])
 
+(define_int_attr mve_vmaxmin_safe_imp [
+		 (VMAXVQ_U "yes")
+		 (VMAXVQ_S "no")
+		 (VMAXAVQ_S "yes")
+		 (VMINVQ_U "no")
+		 (VMINVQ_S "no")
+		 (VMINAVQ_S "no")])
+
 (define_int_attr mve_cmp_op1 [
 		 (VCMPCSQ_M_U "cs")
 		 (VCMPEQQ_M_S "eq") (VCMPEQQ_M_U "eq")
diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index 8aa0bded7f0..d7bdcd862f8 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -393,6 +393,7 @@ (define_insn "@mve_q_"
   "TARGET_HAVE_MVE"
   ".%#\t%0, %q1"
  [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_"))
+  (set_attr "mve_safe_imp_xlane_pred" "yes")
   (set_attr "type" "mve_move")
 ])
 
@@ -529,6 +530,7 @@ (define_insn "@mve_q_v4si"
   "TARGET_HAVE_MVE"
   ".32\t%Q0, %R0, %q1"
  [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_v4si"))
+  (set_attr "mve_safe_imp_xlane_pred" "yes")
   (set_attr "type" "mve_move")
 ])
 
@@ -802,6 +804,7 @@ (define_insn "@mve_q_"
   "TARGET_HAVE_MVE"
   ".%#\t%0, %q2"
  [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_"))
+  (set_attr "mve_safe_imp_xlane_pred" "yes")
   (set_attr "type" "mve_move")
 ])
 
@@ -1014,6 +1017,7 @@ (define_insn "@mve_q_"
   "TARGET_HAVE_MVE"
   ".%#\t%0, %q2"
  [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_"))
+  (set_attr "mve_safe_imp_xlane_pred" "")
   (set_attr "type" "mve_move")
 ])
 
@@ -1033,6 +1037,7 @@ (define_insn "@mve_q_"
   "TARGET_HAVE_MVE"
   ".%#\t%0, %q1, %q2"
  [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_"))
+  (set_attr "mve_safe_imp_xlane_pred" "yes")
   (set_attr "type" "mve_move")
 ])
 
@@ -1219,6 +1224,7 @@ (define_insn "@mve_q_v4si"
   "TARGET_HAVE_MVE"
   ".32\t%Q0, %R0, %q2"
  [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_v4si"))
+  (set_attr "mve_safe_imp_xlane_pred" "yes")
   (set_attr "type" "mve_move")
 ])
 
@@ -1450,6 +1456,7 @@ (define_insn "@mve_q_"
   "TARGET_HAVE_MVE"
   ".%#\t%Q0, %R0, %q1, %q2"
  [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_"))
+  (set_attr "mve_safe_imp_xlane_pred" "yes")
   (set_attr "type" "mve_move")
 ])
 
@@ -1588,6 +1595,7 @@ (define_insn "@mve_q_v4si"
   "TARGET_HAVE_MVE"
   ".32\t%Q0, %R0, %q1, %q2"
  [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_v4si"))
+  (set_attr "mve_safe_imp_xlane_pred" "yes")
   (set_attr "type" "mve_move")
 ])
 
@@ -1725,6 +1733,7 @@ (define_insn "@mve_q_v4si"
   "TARGET_HAVE_MVE"
   ".32\t%Q0, %R0, %q2, %q3"
  [(set (att

[PATCH v5 2/5] doloop: Add support for predicated vectorized loops

2024-02-22 Thread Andre Vieira

This patch adds support in the target agnostic doloop pass for the detection of
predicated vectorized hardware loops.  Arm is currently the only target that
will make use of this feature.

gcc/ChangeLog:

* df-core.cc (df_bb_regno_only_def_find): New helper function.
* df.h (df_bb_regno_only_def_find): Declare new function.
* loop-doloop.cc (doloop_condition_get): Add support for detecting
predicated vectorized hardware loops.
(doloop_modify): Add support for GTU condition checks.
(doloop_optimize): Update costing computation to support alterations to
desc->niter_expr by the backend.

Co-authored-by: Stam Markianos-Wright 
---
 gcc/df-core.cc |  15 +
 gcc/df.h   |   1 +
 gcc/loop-doloop.cc | 164 +++--
 3 files changed, 113 insertions(+), 67 deletions(-)

diff --git a/gcc/df-core.cc b/gcc/df-core.cc
index f0eb4c93957..b0e8a88d433 100644
--- a/gcc/df-core.cc
+++ b/gcc/df-core.cc
@@ -1964,6 +1964,21 @@ df_bb_regno_last_def_find (basic_block bb, unsigned int regno)
   return NULL;
 }
 
+/* Return the one and only def of REGNO within BB.  If there is no def or
+   there are multiple defs, return NULL.  */
+
+df_ref
+df_bb_regno_only_def_find (basic_block bb, unsigned int regno)
+{
+  df_ref temp = df_bb_regno_first_def_find (bb, regno);
+  if (!temp)
+return NULL;
+  else if (temp == df_bb_regno_last_def_find (bb, regno))
+return temp;
+  else
+return NULL;
+}
+
 /* Finds the reference corresponding to the definition of REG in INSN.
DF is the dataflow object.  */
 
diff --git a/gcc/df.h b/gcc/df.h
index 84e5aa8b524..c4e690b40cf 100644
--- a/gcc/df.h
+++ b/gcc/df.h
@@ -987,6 +987,7 @@ extern void df_check_cfg_clean (void);
 #endif
 extern df_ref df_bb_regno_first_def_find (basic_block, unsigned int);
 extern df_ref df_bb_regno_last_def_find (basic_block, unsigned int);
+extern df_ref df_bb_regno_only_def_find (basic_block, unsigned int);
 extern df_ref df_find_def (rtx_insn *, rtx);
 extern bool df_reg_defined (rtx_insn *, rtx);
 extern df_ref df_find_use (rtx_insn *, rtx);
diff --git a/gcc/loop-doloop.cc b/gcc/loop-doloop.cc
index 529e810e530..8953e1de960 100644
--- a/gcc/loop-doloop.cc
+++ b/gcc/loop-doloop.cc
@@ -85,10 +85,10 @@ doloop_condition_get (rtx_insn *doloop_pat)
  forms:
 
  1)  (parallel [(set (pc) (if_then_else (condition)
-	  			(label_ref (label))
-(pc)))
-	 (set (reg) (plus (reg) (const_int -1)))
-	 (additional clobbers and uses)])
+	(label_ref (label))
+	(pc)))
+		 (set (reg) (plus (reg) (const_int -1)))
+		 (additional clobbers and uses)])
 
  The branch must be the first entry of the parallel (also required
  by jump.cc), and the second entry of the parallel must be a set of
@@ -96,19 +96,33 @@ doloop_condition_get (rtx_insn *doloop_pat)
  the loop counter in an if_then_else too.
 
  2)  (set (reg) (plus (reg) (const_int -1))
- (set (pc) (if_then_else (reg != 0)
-	 (label_ref (label))
-			 (pc))).  
+	 (set (pc) (if_then_else (reg != 0)
+ (label_ref (label))
+ (pc))).
 
- Some targets (ARM) do the comparison before the branch, as in the
+ 3) Some targets (Arm) do the comparison before the branch, as in the
  following form:
 
- 3) (parallel [(set (cc) (compare ((plus (reg) (const_int -1), 0)))
-   (set (reg) (plus (reg) (const_int -1)))])
-(set (pc) (if_then_else (cc == NE)
-(label_ref (label))
-(pc))) */
-
+ (parallel [(set (cc) (compare (plus (reg) (const_int -1)) 0))
+		(set (reg) (plus (reg) (const_int -1)))])
+ (set (pc) (if_then_else (cc == NE)
+			 (label_ref (label))
+			 (pc)))
+
+  4) This form supports a construct that is used to represent a vectorized
+  do loop with predication, however we do not need to care about the
+  details of the predication here.
+  Arm uses this construct to support MVE tail predication.
+
+  (parallel
+   [(set (pc)
+	 (if_then_else (gtu (plus (reg) (const_int -n))
+(const_int n-1))
+			   (label_ref)
+			   (pc)))
+	(set (reg) (plus (reg) (const_int -n)))
+	(additional clobbers and uses)])
+ */
   pattern = PATTERN (doloop_pat);
 
   if (GET_CODE (pattern) != PARALLEL)
@@ -173,15 +187,17 @@ doloop_condition_get (rtx_insn *doloop_pat)
   if (! REG_P (reg))
 return 0;
 
-  /* Check if something = (plus (reg) (const_int -1)).
+  /* Check if something = (plus (reg) (const_int -n)).
  On IA-64, this decrement is wrapped in an if_then_else.  */
   inc_src = SET_SRC (inc);
   if (GET_CODE (inc_src) == IF_THEN_ELSE)
 inc_src = XEXP (inc_src, 1);
   if (GET_CODE (inc_src) != PLUS
-  || XEXP (inc_src, 0) != reg
-  || XEXP (inc_src, 1) != constm1_rtx)
+  || !rtx_equal_p (XEXP (inc_src, 0), reg)
+  || !CONST_IN

[PATCH v5 0/5] arm: Add support for MVE Tail-Predicated Low Overhead Loops

2024-02-22 Thread Andre Vieira
Hi,

This is a reworked patch series from.  The main differences are a further split
of patches, where:
[1/5] is arm specific and has been approved before,
[2/5] is target agnostic, has had no substantial changes from v3.
[3/5] new arm specific patch that is split from the original last patch and
annotates across lane instructions that are safe for tail predication if their
tail predicated operands are zeroed.
[4/5] new arm specific patch that could be committed independently of the
series to fix an obvious issue and remove unused unspecs & iterators.
[5/5] (v3-v4) reworked last patch refactoring the implicit predication and some
other validity checks, (v4-v5) removed the expectation that vctp instructions
are always zero extended after this was fixed on trunk.

Original cover letter:
This patch adds support for Arm's MVE Tail Predicated Low Overhead Loop
feature.

The M-class Arm-ARM:
https://developer.arm.com/documentation/ddi0553/bu/?lang=en
Section B5.5.1 "Loop tail predication" describes the feature
we are adding support for with this patch (although
we only add codegen for DLSTP/LETP instruction loops).

Previously with commit d2ed233cb94 we'd added support for
non-MVE DLS/LE loops through the loop-doloop pass, which, given
a standard MVE loop like:

```
void  __attribute__ ((noinline)) test (int16_t *a, int16_t *b, int16_t *c, int n)
{
  while (n > 0)
    {
      mve_pred16_t p = vctp16q (n);
      int16x8_t va = vldrhq_z_s16 (a, p);
      int16x8_t vb = vldrhq_z_s16 (b, p);
      int16x8_t vc = vaddq_x_s16 (va, vb, p);
      vstrhq_p_s16 (c, vc, p);
      c+=8;
      a+=8;
      b+=8;
      n-=8;
    }
}
```
.. would output:

```

        dls     lr, lr
.L3:
        vctp.16 r3
        vmrs    ip, P0  @ movhi
        sxth    ip, ip
        vmsr    P0, ip  @ movhi
        mov     r4, r0
        vpst
        vldrht.16       q2, [r4]
        mov     r4, r1
        vmov    q3, q0
        vpst
        vldrht.16       q1, [r4]
        mov     r4, r2
        vpst
        vaddt.i16       q3, q2, q1
        subs    r3, r3, #8
        vpst
        vstrht.16       q3, [r4]
        adds    r0, r0, #16
        adds    r1, r1, #16
        adds    r2, r2, #16
        le      lr, .L3
```

where the LE instruction will decrement LR by 1, compare and
branch if needed.

(There are also other inefficiencies in the above code, like the
pointless vmrs/sxth/vmsr on the VPR, the adds not being merged
into the vldrht/vstrht as #16 offsets, and some random movs!
But those are different problems...)

The MVE version is similar, except that:
* Instead of DLS/LE the instructions are DLSTP/LETP.
* Instead of pre-calculating the number of iterations of the
  loop, we place the number of elements to be processed by the
  loop into LR.
* Instead of decrementing the LR by one, LETP will decrement it
  by FPSCR.LTPSIZE, which is the number of elements being
  processed in each iteration: 16 for 8-bit elements, 8 for 16-bit
  elements, etc.
* On the final iteration, automatic Loop Tail Predication is
  performed, as if the instructions within the loop had been VPT
  predicated with a VCTP generating the VPR predicate in every
  loop iteration.
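
The differences listed above can be sketched as a small C++ model.  This is a
simplified illustration under stated assumptions (128-bit vectors, a counter
that simply saturates at zero), not a description of the exact architectural
behaviour; `lanes_for_element_bits`, `dls_iterations` and `letp_active_lanes`
are hypothetical names introduced here.

```cpp
#include <cstdint>
#include <vector>

// Number of elements handled per iteration for a 128-bit vector.
unsigned lanes_for_element_bits(unsigned bits) { return 128u / bits; }

// DLS/LE model: LR holds a precomputed iteration count and is decremented
// by one on every trip round the loop.
unsigned dls_iterations(unsigned elements, unsigned lanes) {
  return (elements + lanes - 1) / lanes;
}

// DLSTP/LETP model: LR holds the number of elements still to process and is
// decremented by the lane count each iteration.  The returned vector lists
// how many lanes are active in each iteration; only the last entry can be
// smaller than `lanes`, which is where tail predication applies.
std::vector<unsigned> letp_active_lanes(unsigned elements, unsigned lanes) {
  std::vector<unsigned> active;
  unsigned lr = elements;
  while (lr > 0) {
    active.push_back(lr < lanes ? lr : lanes);
    lr = lr > lanes ? lr - lanes : 0;
  }
  return active;
}
```

For 20 16-bit elements, the model gives three iterations in both schemes; the
tail-predicated version reports 8, 8 and 4 active lanes without any explicit
VCTP in the loop body.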

The dlstp/letp loop now looks like:

```

        dlstp.16        lr, r3
.L14:
        mov     r3, r0
        vldrh.16        q3, [r3]
        mov     r3, r1
        vldrh.16        q2, [r3]
        mov     r3, r2
        vadd.i16        q3, q3, q2
        adds    r0, r0, #16
        vstrh.16        q3, [r3]
        adds    r1, r1, #16
        adds    r2, r2, #16
        letp    lr, .L14

```

Since the loop tail predication is automatic, we have eliminated
the VCTP that had been specified by the user in the intrinsic
and converted the VPT-predicated instructions into their
unpredicated equivalents (which also saves us from VPST insns).

The LETP instruction here decrements LR by 8 in each iteration.

Stam Markianos-Wright (1):
  arm: Add define_attr to create a mapping between MVE predicated and
unpredicated insns

Andre Vieira (4):
  doloop: Add support for predicated vectorized loops
  arm: Annotate instructions with mve_safe_imp_xlane_pred
  arm: Fix a wrong attribute use and remove unused unspecs and iterators
  arm: Add support for MVE Tail-Predicated Low Overhead Loops


-- 
2.17.1


[COMMITTED] warn-access: Fix handling of unnamed types [PR109804]

2024-02-22 Thread Andrew Pinski
This looks like an oversight in handling DEMANGLE_COMPONENT_UNNAMED_TYPE.
DEMANGLE_COMPONENT_UNNAMED_TYPE only has the u.s_number.number set while
the code expected newc.u.s_binary.left would be valid.
So this treats DEMANGLE_COMPONENT_UNNAMED_TYPE like we treat function parameters
(DEMANGLE_COMPONENT_FUNCTION_PARAM) and template parameters
(DEMANGLE_COMPONENT_TEMPLATE_PARAM).

Note the code in the demangler does this when it sets
DEMANGLE_COMPONENT_UNNAMED_TYPE:
  ret->type = DEMANGLE_COMPONENT_UNNAMED_TYPE;
  ret->u.s_number.number = num;
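
As an illustration of the failure mode, here is a minimal C++ sketch of a
tagged union where each kind fills in only one union member.  `Kind`, `Node`,
`make_number` and `mismatch` are hypothetical names for this sketch; they only
loosely mirror demangle_component and new_delete_mismatch_p and are not the
GCC code.

```cpp
#include <cstdint>

enum class Kind { Binary, FunctionParam, TemplateParam, UnnamedType };

struct Node {
  Kind kind;
  union {
    struct { const Node *left; const Node *right; } binary;  // Binary only
    struct { long number; } number;                          // numbered kinds
  } u;
};

// Build a node of a numbered kind; only u.number is initialised.
Node make_number(Kind kind, long num) {
  Node n;
  n.kind = kind;
  n.u.number.number = num;
  return n;
}

bool mismatch(const Node &a, const Node &b);

// Compare two optional children; a missing child only matches a missing child.
static bool child_mismatch(const Node *x, const Node *y) {
  if (!x || !y) return x != y;
  return mismatch(*x, *y);
}

bool mismatch(const Node &a, const Node &b) {
  if (a.kind != b.kind) return true;
  switch (a.kind) {
    case Kind::FunctionParam:
    case Kind::TemplateParam:
    case Kind::UnnamedType:   // must be listed here: u.binary is not valid
      return a.u.number.number != b.u.number.number;
    case Kind::Binary:
      return child_mismatch(a.u.binary.left, b.u.binary.left)
             || child_mismatch(a.u.binary.right, b.u.binary.right);
  }
  return false;
}
```

If `Kind::UnnamedType` were routed to the `Binary` arm instead, the comparison
would reinterpret the stored number as a pointer, which is the kind of invalid
access the fix avoids.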

Committed as obvious after bootstrap/test on x86_64-linux-gnu.
Will commit to other branches in a few days.

PR tree-optimization/109804

gcc/ChangeLog:

* gimple-ssa-warn-access.cc (new_delete_mismatch_p): Handle
DEMANGLE_COMPONENT_UNNAMED_TYPE.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wmismatched-new-delete-8.C: New test.

Signed-off-by: Andrew Pinski 
---
 gcc/gimple-ssa-warn-access.cc |  1 +
 .../g++.dg/warn/Wmismatched-new-delete-8.C| 42 +++
 2 files changed, 43 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/warn/Wmismatched-new-delete-8.C

diff --git a/gcc/gimple-ssa-warn-access.cc b/gcc/gimple-ssa-warn-access.cc
index cd083ab2237..dedaae27b31 100644
--- a/gcc/gimple-ssa-warn-access.cc
+++ b/gcc/gimple-ssa-warn-access.cc
@@ -1701,6 +1701,7 @@ new_delete_mismatch_p (const demangle_component &newc,
 
 case DEMANGLE_COMPONENT_FUNCTION_PARAM:
 case DEMANGLE_COMPONENT_TEMPLATE_PARAM:
+case DEMANGLE_COMPONENT_UNNAMED_TYPE:
   return newc.u.s_number.number != delc.u.s_number.number;
 
 case DEMANGLE_COMPONENT_CHARACTER:
diff --git a/gcc/testsuite/g++.dg/warn/Wmismatched-new-delete-8.C b/gcc/testsuite/g++.dg/warn/Wmismatched-new-delete-8.C
new file mode 100644
index 000..0ddc056c6df
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wmismatched-new-delete-8.C
@@ -0,0 +1,42 @@
+/* PR tree-optimization/109804 */
+/* { dg-do compile { target c++11 } } */
+/* { dg-options "-Wall" } */
+
+/* Here we used to ICE in new_delete_mismatch_p because
+   we didn't handle unnamed types from the demangler (DEMANGLE_COMPONENT_UNNAMED_TYPE). */
+
+template 
+static inline T * construct_at(void *at, ARGS && args)
+{
+ struct Placeable : T
+ {
+  Placeable(ARGS && args) : T(args) { }
+  void * operator new (long unsigned int, void *ptr) { return ptr; }
+  void operator delete (void *, void *) { }
+ };
+ return new (at) Placeable(static_cast(args));
+}
+template 
+struct Reconstructible
+{
+  char _space[sizeof(MT)];
+  Reconstructible() { }
+};
+template 
+struct Constructible : Reconstructible
+{
+ Constructible(){}
+};
+struct A { };
+struct B
+{
+ Constructible a { };
+ B(int) { }
+};
+Constructible b { };
+void f()
+{
+  enum { ENUM_A = 1 };
+  enum { ENUM_B = 1 };
+  construct_at(b._space, ENUM_B);
+}
-- 
2.43.0



Re: [PATCH] AArch64: memcpy/memset expansions should not emit LDP/STP [PR113618]

2024-02-22 Thread Wilco Dijkstra
Hi Richard,

> It looks like this is really doing two things at once: disabling the
> direct emission of LDP/STP Qs, and switching the GPR handling from using
> pairs of DImode moves to single TImode moves.  At least, that seems to be
> the effect of...

No, it still uses TImode for the !TARGET_SIMD case.

> +   if (GET_MODE_SIZE (mode_iter.require ()) <= MIN (size, 16))
> + mode = mode_iter.require ();

> ...hard-coding 16 here and...

This only affects the Q register case.

> -  if (size > 0 && size < copy_max / 2 && !STRICT_ALIGNMENT)
> +  if (size > 0 && size < 16 && !STRICT_ALIGNMENT)

> ...changing this limit from 8 to 16 for non-SIMD copies.
>
> Is that deliberate?  If so, please mention that kind of thing in the
> covering note.  It sounded like this was intended to change the handling
> of vector moves only.

Yes, it's deliberate. It now basically treats everything as blocks of 16 bytes,
which has a nice simplifying effect. I've added a note.

> This means that, for GPRs, we are now effectively using the double-word
> move patterns to get an LDP/STP indirectly, rather than directly as before.

No, there is no difference here.

> That seems OK, and I suppose might be slightly preferable to the current
> code for things like:
>
>  char a[31], b[31];
>  void f() { __builtin_memcpy(a, b, 31); }

Yes, an unaligned tail improves slightly by using blocks of 16 bytes.
It's a very rare case: -mgeneral-regs-only is rarely used, and most
fixed-size copies are a nice multiple of 8.
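
To make the "blocks of 16 bytes" idea concrete, here is a rough C++ sketch of
how a fixed-size copy could be planned with 16-byte blocks and an overlapping
final block for the tail.  It is an illustration only: `plan_copy` and
`copy_with_plan` are hypothetical helpers, and the actual expansion in
aarch64_expand_cpymem selects modes differently.

```cpp
#include <cstddef>
#include <cstring>
#include <utility>
#include <vector>

// Each entry is (offset, size) of one load/store pair.
using CopyPlan = std::vector<std::pair<std::size_t, std::size_t>>;

CopyPlan plan_copy(std::size_t size) {
  CopyPlan ops;
  std::size_t offset = 0;
  // Whole 16-byte blocks first.
  while (size - offset >= 16) {
    ops.push_back({offset, 16});
    offset += 16;
  }
  if (offset == size) return ops;
  // Tail: if at least one full block fits, re-copy the last 16 bytes so that
  // the final block overlaps the previous one instead of splitting the tail.
  if (size >= 16) {
    ops.push_back({size - 16, 16});
    return ops;
  }
  // Copies smaller than 16 bytes fall back to power-of-two chunks.
  for (std::size_t chunk = 8; chunk >= 1; chunk /= 2)
    while (size - offset >= chunk) {
      ops.push_back({offset, chunk});
      offset += chunk;
    }
  return ops;
}

// Execute a plan; valid for non-overlapping source and destination buffers.
void copy_with_plan(unsigned char *dst, const unsigned char *src,
                    std::size_t size) {
  for (const auto &op : plan_copy(size))
    std::memcpy(dst + op.first, src + op.first, op.second);
}
```

With this scheme the 31-byte copy from the quoted example needs two 16-byte
accesses, at offsets 0 and 15, rather than a 16-byte block followed by 8-, 4-,
2- and 1-byte pieces.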

> But that raises the question: should we do the same thing for Q registers
> and V2x16QImode?

I don't believe it makes sense to use those complex types. And it likely
blocks optimizations in a similar way as UNSPEC does.

> If emitting individual vector loads and stores is better than using
> V2x16QI (and I can see that it might be), then why isn't the same
> true for GPRs and DImode vs TImode?

It might be feasible to do the same for scalar copies. But given that
using TImode works fine, there is no regression here, and use of
-mgeneral-regs-only is rare, what would the benefit be of doing that?

> I think the final version of this patch should go in ahead of the
> clean-up patch.  As I mentioned in the other review, I think the
> clean-up should wait for GCC 15.

I've rebased it to the trunk.

Cheers,
Wilco


v2: Rebase to trunk

The new RTL introduced for LDP/STP results in regressions due to use of UNSPEC.
Given the new LDP fusion pass is good at finding LDP opportunities, change the
memcpy, memmove and memset expansions to emit single vector loads/stores.
This fixes the regression and enables more RTL optimization on the standard
memory accesses.  Handling of unaligned tail of memcpy/memmove is improved
with -mgeneral-regs-only.  SPEC2017 performance improves slightly.  Codesize
is a bit worse due to missed LDP opportunities as discussed in the PR.

Passes regress, OK for commit?

gcc/ChangeLog:
PR target/113618
* config/aarch64/aarch64.cc (aarch64_copy_one_block): Remove. 
(aarch64_expand_cpymem): Emit single load/store only.
(aarch64_set_one_block): Emit single stores only.

gcc/testsuite/ChangeLog:
PR target/113618
* gcc.target/aarch64/pr113618.c: New test.

---

diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 16318bf925883ecedf9345e53fc0824a553b2747..0a28e033088a00818c6ed9fa8c15ecdee5a86c35 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -26465,33 +26465,6 @@ aarch64_progress_pointer (rtx pointer)
   return aarch64_move_pointer (pointer, GET_MODE_SIZE (GET_MODE (pointer)));
 }
 
-typedef auto_vec<std::pair<rtx, rtx>, 12> copy_ops;
-
-/* Copy one block of size MODE from SRC to DST at offset OFFSET.  */
-static void
-aarch64_copy_one_block (copy_ops &ops, rtx src, rtx dst,
-   int offset, machine_mode mode)
-{
-  /* Emit explict load/store pair instructions for 32-byte copies.  */
-  if (known_eq (GET_MODE_SIZE (mode), 32))
-{
-  mode = V4SImode;
-  rtx src1 = adjust_address (src, mode, offset);
-  rtx dst1 = adjust_address (dst, mode, offset);
-  rtx reg1 = gen_reg_rtx (mode);
-  rtx reg2 = gen_reg_rtx (mode);
-  rtx load = aarch64_gen_load_pair (reg1, reg2, src1);
-  rtx store = aarch64_gen_store_pair (dst1, reg1, reg2);
-  ops.safe_push ({ load, store });
-  return;
-}
-
-  rtx reg = gen_reg_rtx (mode);
-  rtx load = gen_move_insn (reg, adjust_address (src, mode, offset));
-  rtx store = gen_move_insn (adjust_address (dst, mode, offset), reg);
-  ops.safe_push ({ load, store });
-}
-
 /* Expand a cpymem/movmem using the MOPS extension.  OPERANDS are taken
from the cpymem/movmem pattern.  IS_MEMMOVE is true if this is a memmove
rather than memcpy.  Return true iff we succeeded.  */
@@ -26527,7 +26500,7 @@ aarch64_expand_cpymem (rtx *operands, bool is_memmove)
   rtx src = operands[1];
   unsigned align = UINTVAL (operands[3]);
   rtx

[PATCH] developer option: -fdump-generic-nodes; initial incorporation

2024-02-22 Thread Robert Dubner
As part of an effort to learn how to create a GENERIC tree in order to implement
a COBOL front end, I created dump_generic_nodes(), which accepts a
function_decl at the point it is provided to the middle end.  The routine
generates three files.  One is ASCII, the second is HTML; they contain the tree
in a human-readable form.  The third is JSON.

This commit modifies common.opt to accept the -fdump-generic-nodes command-line
option, creates the dump-generic-nodes.cc and .h files to implement it, and
inserts a call to the dump_generic_nodes() function near the top of
gimplify_function_tree() in gcc/gimplify.cc.

This patch has been tested on x86_64-linux-gnu.  I haven't tried to provide
testcases for the automated system because 1) I haven't learned how to do that,
and 2) I am not sure how to test this feature.  The compiler isn't affected
when the switch isn't present; when it is present, it seems to work on simple
source code.

Legal requirements:  The FSF has on file an "employer disclaimer" for me.

I am using the "Signed-off-by" tag in an attempt to cover the legal bases; I
trust I will be apprised of anything else that needs to be done.

gcc/ChangeLog:

* developer options: -fdump-generic-nodes initial incorporation

Signed-off-by: Robert Dubner 
---
 gcc/Makefile.in   |3 +-
 gcc/common.opt|4 +
 gcc/dump-generic-nodes.cc | 1958 +
 gcc/dump-generic-nodes.h  |   26 +
 gcc/gimplify.cc   |3 +
 5 files changed, 1993 insertions(+), 1 deletion(-)
 create mode 100644 gcc/dump-generic-nodes.cc
 create mode 100644 gcc/dump-generic-nodes.h

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index a74761b7ab3..81922b0884c 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1441,6 +1441,7 @@ OBJS = \
domwalk.o \
double-int.o \
dse.o \
+   dump-generic-nodes.o \
dumpfile.o \
dwarf2asm.o \
dwarf2cfi.o \
@@ -3857,7 +3858,7 @@ PLUGIN_HEADERS = $(TREE_H) $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
   hash-set.h dominance.h cfg.h cfgrtl.h cfganal.h cfgbuild.h cfgcleanup.h \
   lcm.h cfgloopmanip.h file-prefix-map.h builtins.def $(INSN_ATTR_H) \
   pass-instances.def params.list $(srcdir)/../include/gomp-constants.h \
-  $(EXPR_H) $(srcdir)/analyzer/*.h
+  $(EXPR_H) $(srcdir)/analyzer/*.h dump-generic-nodes.h
 
 # generate the 'build fragment' b-header-vars
 s-header-vars: Makefile
diff --git a/gcc/common.opt b/gcc/common.opt
index 51c4a17da83..751b9b1f0cc 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1583,6 +1583,10 @@ fdump-passes
 Common Var(flag_dump_passes) Init(0)
 Dump optimization passes.
 
+fdump-generic-nodes
+Common Var(flag_dump_generic_nodes) Init(0)
+Dump GENERIC trees for each function in three files: .nodes, .nodes.html, and .json
+
 fdump-unnumbered
 Common Var(flag_dump_unnumbered)
 Suppress output of instruction numbers, line number notes and addresses in debugging dumps.
diff --git a/gcc/dump-generic-nodes.cc b/gcc/dump-generic-nodes.cc
new file mode 100644
index 000..d44119116d2
--- /dev/null
+++ b/gcc/dump-generic-nodes.cc
@@ -0,0 +1,1958 @@
+/* Prints out a tree of generic/gimple nodes in human readable form, both in
+   straight text and in HTML. The entry point is dump_generic_nodes().
+
+   Copyright(C) 1990-2024 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or(at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "tree.h"
+#include "cgraph.h"
+#include "diagnostic.h"
+#include "varasm.h"
+#include "print-rtl.h"
+#include "stor-layout.h"
+#include "langhooks.h"
+#include "tree-iterator.h"
+#include "gimple-pretty-print.h"
+#include "tree-cfg.h"
+#include "dumpfile.h"
+
+#undef DEFTREESTRUCT
+#define DEFTREESTRUCT(VAL, NAME) NAME,
+static const char *ts_enum_names[] =
+  {
+  #include "treestruct.def"
+  };
+#undef DEFTREESTRUCT
+
+#define ADD_FLAG(accessor,text)if(accessor(node)){strcat(ach," " text);}
+
+static FILE *ftext = NULL;
+static FILE *fhtml = NULL;
+static FILE *fjson = NULL;
+
+static int json_level = 0;
+static const char *json_comma;
+static const int spaces_per_indent = 2;
+
+static void rjd_print_node(tree node);
+
+static int phase = 1;
+
+/* Define the hash table of nodes already seen.
+   Such nodes are not repeated; brief cross-references are used.  */
+
+stru

[PATCH] c, v2: Handle scoped attributes in __has*attribute and scoped attribute parsing changes in -std=c11 etc. modes [PR114007]

2024-02-22 Thread Jakub Jelinek
On Thu, Feb 22, 2024 at 04:35:22PM +0100, Michael Matz wrote:
> On Thu, 22 Feb 2024, Jakub Jelinek wrote:
> 
> > > Hmm, shouldn't you be able to use (nonexistence of) the PREV_WHITE flag on
> > > the second COLON token to see that it's indeed a '::' without intervening
> > > whitespace?  Instead of setting a new flag on the first COLON token?
> > > 
> > > I.e. something like this:
> > > 
> > >if (c_parser_next_token_is (parser, CPP_SCOPE)
> > > -  || (loose_scope_p
> > > - && c_parser_next_token_is (parser, CPP_COLON)
> > >   && c_parser_peek_2nd_token (parser)->type == CPP_COLON))
> > > + && !(c_parser_peek_2nd_token (parser)->flags & PREV_WHITE)))
> > > 
> > > ?
> > 
> > That doesn't seem to work.
> 
> Too bad then.  I had hoped it would make the code easier without changes 
> to c-lex.  Well, then ... was worth a try, I'll crouch back under my stone
> :)

Actually, I could make it work with two simple add_flags |= PREV_WHITE;
in c_lex_with_flags.  PREV_WHITE in FE tokens is only checked in this new
spot in [[]] C parsing and in
  /* If we find the sequence `[:' after a template-name, it's probably
 a digraph-typo for `< ::'. Substitute the tokens and check if we can
 parse correctly the argument list.  */
  if (((next_token = cp_lexer_peek_token (parser->lexer))->type
   == CPP_OPEN_SQUARE)
  && next_token->flags & DIGRAPH
  && ((next_token_2 = cp_lexer_peek_nth_token (parser->lexer, 2))->type
  == CPP_COLON)
  && !(next_token_2->flags & PREV_WHITE))
{
  cp_parser_parse_tentatively (parser);
  /* Change `:' into `::'.  */
  next_token_2->type = CPP_SCOPE;
  /* Consume the first token (CPP_OPEN_SQUARE - which we pretend it is
 CPP_LESS.  */
  cp_lexer_consume_token (parser->lexer);
in the C++ FE and the same workaround in another spot.  But it seems the result
with -std=c++98 -fpermissive on
struct S{};
template  struct U{};
U<::S> u;
U<:/**/:S> v;
#define FOO <:
#define BAR
U FOO BAR S> w;
actually doesn't change: u is accepted with -fpermissive and v/w are rejected
both without and with the patch; apparently the PREV_WHITE flag is already set
in those cases for some reason.
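
To illustrate the idea (this is only a sketch, not the code from the patch):
the token model below is hypothetical, with TOK_PREV_WHITE standing in for
libcpp's PREV_WHITE and strip_padding standing in for the flag propagation
described for c_lex_with_flags.  Padding and comment tokens are dropped, but
their presence is OR-ed into the next real token, so a later check can tell
`::' apart from `: :' and `:/**/:'.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical token model; GCC's real cpp_token carries far more state.  */
enum tok_type { TOK_NAME, TOK_COLON, TOK_PADDING, TOK_COMMENT };

#define TOK_PREV_WHITE 0x1u  /* Stand-in for libcpp's PREV_WHITE.  */

struct tok
{
  enum tok_type type;
  unsigned flags;
};

/* Copy RAW into OUT, dropping padding and comment tokens but OR-ing
   TOK_PREV_WHITE into the next real token.  Returns the number of
   tokens written.  OUT must have room for N tokens.  */
static size_t
strip_padding (const struct tok *raw, size_t n, struct tok *out)
{
  unsigned add_flags = 0;
  size_t n_out = 0;

  for (size_t i = 0; i < n; i++)
    {
      if (raw[i].type == TOK_PADDING || raw[i].type == TOK_COMMENT)
        {
          add_flags |= TOK_PREV_WHITE;
          continue;
        }
      out[n_out] = raw[i];
      out[n_out].flags |= add_flags;
      add_flags = 0;
      n_out++;
    }
  return n_out;
}

/* True if TOKS[I] and TOKS[I + 1] are two colons with nothing between
   them, i.e. a pair that may be treated like a scope token.  */
static bool
adjacent_colons_p (const struct tok *toks, size_t n, size_t i)
{
  return (i + 1 < n
          && toks[i].type == TOK_COLON
          && toks[i + 1].type == TOK_COLON
          && (toks[i + 1].flags & TOK_PREV_WHITE) == 0);
}
```

With this model, a stream for `gnu::attr' passes adjacent_colons_p at the
first colon, while a stream with a comment token between the colons does not,
even though the comment itself no longer appears in the stripped output.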

2024-02-22  Jakub Jelinek  

PR c/114007
gcc/
* doc/extend.texi: (__extension__): Remove comments about scope
tokens vs. two colons.
gcc/c-family/
* c-lex.cc (c_common_has_attribute): Parse 2 CPP_COLONs with
the second one without PREV_WHITE flag the same as CPP_SCOPE.
(c_lex_with_flags): For CPP_PADDING or CPP_COMMENT, or PREV_WHITE
into add_flags.
gcc/c/
* c-parser.cc (c_parser_std_attribute): Remove loose_scope_p argument.
Instead of checking it, parse 2 CPP_COLONs with the second one without
PREV_WHITE flag the same as CPP_SCOPE.
(c_parser_std_attribute_list): Remove loose_scope_p argument, don't
pass it to c_parser_std_attribute.
(c_parser_std_attribute_specifier): Adjust c_parser_std_attribute_list
caller.
gcc/testsuite/
* gcc.dg/c23-attr-syntax-6.c: Adjust testcase for :: being valid
even in -std=c11 even without __extension__ and : : etc. not being
valid anymore even with __extension__.
* gcc.dg/c23-attr-syntax-7.c: Likewise.
* gcc.dg/c23-attr-syntax-8.c: New test.

--- gcc/doc/extend.texi.jj  2024-02-22 10:10:18.907029080 +0100
+++ gcc/doc/extend.texi 2024-02-22 16:06:33.197555930 +0100
@@ -12626,10 +12626,7 @@ In C, writing:
 @end smallexample
 
 suppresses warnings about using @samp{[[]]} attributes in C versions
-that predate C23@.  Since the scope token @samp{::} is not a single
-lexing token in earlier versions of C, this construct also allows two colons
-to be used in place of @code{::}.  GCC does not check whether the two
-colons are immediately adjacent.
+that predate C23@.
 @end itemize
 
 @code{__extension__} has no effect aside from this.
--- gcc/c-family/c-lex.cc.jj2024-02-22 10:09:48.408450163 +0100
+++ gcc/c-family/c-lex.cc   2024-02-22 16:40:59.00382 +0100
@@ -357,7 +357,27 @@ c_common_has_attribute (cpp_reader *pfil
   do
nxt_token = cpp_peek_token (pfile, idx++);
   while (nxt_token->type == CPP_PADDING);
-  if (nxt_token->type == CPP_SCOPE)
+  if (!c_dialect_cxx ()
+ && flag_iso
+ && !flag_isoc23
+ && nxt_token->type == CPP_COLON)
+   {
+ const cpp_token *prev_token = nxt_token;
+ nxt_token = cpp_peek_token (pfile, idx);
+ if (nxt_token->type == CPP_COLON
+ && (nxt_token->flags & PREV_WHITE) == 0)
+   {
+ /* __has_attribute (vendor::attr) in -std=c17 etc. modes.
+:: isn't CPP_SCOPE but 2 CPP_COLON tokens, where the
+second one shouldn't have PREV_WHITE flag to distinguish
+it from : :.  */
+ have_scope = true;
+ get_token_no_padding (pfile); // Eat first colon.
+   }
+ else
+ 

[PATCH v1 05/13] Reuse MinGW from i386 for AArch64

2024-02-22 Thread Evgeny Karpov
A ChangeLog template using "Moved... ...here" has been generated by 
contrib/mklog.py.
It seems that it needs modification.

Regards,
Evgeny

-Original Message-
Thursday, February 22, 2024 12:11 PM
Richard Earnshaw (lists) wrote:

> The ChangeLog has to be expressed in present tense, as mandated by the 
> standard; s/Moved/Move/g etc.

Agreed, but that's a detail that we can get to once the patch has been properly 
reviewed.



[PATCH v1 05/13] Reuse MinGW from i386 for AArch64

2024-02-22 Thread Evgeny Karpov
Hello Andrew,

Thank you for the review!

Renaming functions and comments that reference functions with the
i386_ prefix can trigger extensive changes. This task should
ideally be handled in a separate follow-up patch.

The primary goal of the renaming changes in
https://gcc.gnu.org/pipermail/gcc-patches/attachments/20240221/8f41fb9a/attachment-0001.txt
 
was to avoid direct references from the aarch64 target to 
functions with i386_ prefix.

Regards,
Evgeny

-Original Message-
Wednesday, February 21, 2024 7:50 PM
Andrew Pinski wrote:

On Wed, Feb 21, 2024 at 10:38 AM Evgeny Karpov  
wrote:
>
>

In config/i386/winnt.cc there are many x86_64 comments and many function names 
that use i386 in them. When moving the file, it seems better to rename them and 
remove references to 86.
I don't see any changes that rename the functions or comments there.

Thanks,
Andrew


Re: [PATCH] c: Handle scoped attributes in __has*attribute and scoped attribute parsing changes in -std=c11 etc. modes [PR114007]

2024-02-22 Thread Michael Matz
Hi,

On Thu, 22 Feb 2024, Jakub Jelinek wrote:

> > Hmm, shouldn't you be able to use (nonexistence of) the PREV_WHITE flag on 
> > the second COLON token to see that it's indeed a '::' without intervening 
> > whitespace?  Instead of setting a new flag on the first COLON token?
> > 
> > I.e. something like this:
> > 
> >if (c_parser_next_token_is (parser, CPP_SCOPE)
> > -  || (loose_scope_p
> > - && c_parser_next_token_is (parser, CPP_COLON)
> >   && c_parser_peek_2nd_token (parser)->type == CPP_COLON))
> > + && !(c_parser_peek_2nd_token (parser)->flags & PREV_WHITE)))
> > 
> > ?
> 
> That doesn't seem to work.

Too bad then.  I had hoped it would make the code easier without changes 
to c-lex.  Well, then ... was worth a try, I'll crouch back under my stone
:)


Ciao,
Michael.


Re: [PATCH] c: Handle scoped attributes in __has*attribute and scoped attribute parsing changes in -std=c11 etc. modes [PR114007]

2024-02-22 Thread Jakub Jelinek
On Thu, Feb 22, 2024 at 03:59:31PM +0100, Michael Matz wrote:
> Hello,
> 
> On Thu, 22 Feb 2024, Jakub Jelinek wrote:
> 
> > So, the following patch adds a flag during preprocessing at the point
> > where we normally create CPP_SCOPE tokens out of 2 consecutive colons
> > on the first CPP_COLON to mark the consecutive case (as we are tight
> > on the bits, I've reused the PURE_ZERO flag, which is used just by the
> > C++ FE and only ever set (both C and C++) on CPP_NUMBER tokens, this
> > new flag has the same value and is only ever used on CPP_COLON tokens)
> 
> Hmm, shouldn't you be able to use (nonexistence of) the PREV_WHITE flag on 
> the second COLON token to see that it's indeed a '::' without intervening 
> whitespace?  Instead of setting a new flag on the first COLON token?
> 
> I.e. something like this:
> 
>if (c_parser_next_token_is (parser, CPP_SCOPE)
> -  || (loose_scope_p
> - && c_parser_next_token_is (parser, CPP_COLON)
>   && c_parser_peek_2nd_token (parser)->type == CPP_COLON))
> + && !(c_parser_peek_2nd_token (parser)->flags & PREV_WHITE)))
> 
> ?

That doesn't seem to work.
Compared to the posted patch it doesn't raise the 2 extra errors on
gcc.dg/c23-attr-syntax-6.c
#define JOIN2(A, B) A##B
typedef int [[__extension__ gnu JOIN2(:,:) vector_size (4)]] b5;
and that is just fine, that is error recovery after another error,
but doesn't diagnose:
#define BAR :
typedef int [[__extension__ gnu BAR BAR vector_size (4)]] b8;
nor
#define JOIN(A, B) A/**/B
typedef int [[__extension__ gnu JOIN(:,:) vector_size (4)]] b10;
(nor similar cases without __extension__).

Maybe it is about whether there are CPP_PADDING tokens in between if
PREV_WHITE is missing, but on c_parser_peek*_token we don't know if
there were any.  Sure, on the c_common_has_attribute side that could
be done just by dropping the second loop.

--- gcc/doc/extend.texi.jj  2024-02-22 10:10:18.907029080 +0100
+++ gcc/doc/extend.texi 2024-02-22 16:06:33.197555930 +0100
@@ -12626,10 +12626,7 @@ In C, writing:
 @end smallexample
 
 suppresses warnings about using @samp{[[]]} attributes in C versions
-that predate C23@.  Since the scope token @samp{::} is not a single
-lexing token in earlier versions of C, this construct also allows two colons
-to be used in place of @code{::}.  GCC does not check whether the two
-colons are immediately adjacent.
+that predate C23@.
 @end itemize
 
 @code{__extension__} has no effect aside from this.
--- gcc/c-family/c-lex.cc.jj2024-02-22 10:09:48.408450163 +0100
+++ gcc/c-family/c-lex.cc   2024-02-22 16:09:50.822825035 +0100
@@ -357,7 +357,29 @@ c_common_has_attribute (cpp_reader *pfil
   do
nxt_token = cpp_peek_token (pfile, idx++);
   while (nxt_token->type == CPP_PADDING);
-  if (nxt_token->type == CPP_SCOPE)
+  if (!c_dialect_cxx ()
+ && flag_iso
+ && !flag_isoc23
+ && nxt_token->type == CPP_COLON)
+   {
+ const cpp_token *prev_token = nxt_token;
+ do
+   nxt_token = cpp_peek_token (pfile, idx++);
+ while (nxt_token->type == CPP_PADDING);
+ if (nxt_token->type == CPP_COLON
+ && (nxt_token->flags & PREV_WHITE) == 0)
+   {
+ /* __has_attribute (vendor::attr) in -std=c17 etc. modes.
+:: isn't CPP_SCOPE but 2 CPP_COLON tokens, where the
+second one shouldn't have PREV_WHITE flag to distinguish
+it from : :.  */
+ have_scope = true;
+ get_token_no_padding (pfile); // Eat first colon.
+   }
+ else
+   nxt_token = prev_token;
+   }
+  if (nxt_token->type == CPP_SCOPE || have_scope)
{
  have_scope = true;
  get_token_no_padding (pfile); // Eat scope.
--- gcc/c/c-parser.cc.jj2024-02-22 10:09:48.467449349 +0100
+++ gcc/c/c-parser.cc   2024-02-22 16:11:05.320795586 +0100
@@ -5705,8 +5705,7 @@ c_parser_omp_sequence_args (c_parser *pa
indicates whether this relaxation is in effect.  */
 
 static tree
-c_parser_std_attribute (c_parser *parser, bool for_tm,
-   bool loose_scope_p = false)
+c_parser_std_attribute (c_parser *parser, bool for_tm)
 {
   c_token *token = c_parser_peek_token (parser);
   tree ns, name, attribute;
@@ -5720,9 +5719,10 @@ c_parser_std_attribute (c_parser *parser
   name = canonicalize_attr_name (token->value);
   c_parser_consume_token (parser);
   if (c_parser_next_token_is (parser, CPP_SCOPE)
-  || (loose_scope_p
+  || (!flag_isoc23
  && c_parser_next_token_is (parser, CPP_COLON)
- && c_parser_peek_2nd_token (parser)->type == CPP_COLON))
+ && c_parser_peek_2nd_token (parser)->type == CPP_COLON
+ && (c_parser_peek_2nd_token (parser)->flags & PREV_WHITE) == 0))
 {
   ns = name;
   if (c_parser_next_token_is (parser, CPP_COLON))
@@ -5841,8 +5841,7 @@ c_parser_std_attribute (c_parser *parser
 }
 
 static tree
-c_parser_s

RE: [EXTERNAL] Re: [PATCH v1 03/13] aarch64: Mark x18 register as a fixed register for MS ABI

2024-02-22 Thread Evgeny Karpov
Hi Richard,

Thanks for the review!

TARGET_ARM64_MS_ABI refers to the official Microsoft ARM64 ABI naming used for 
the target. 
If AARCH64 is the preferred name, it will be changed in PATCH v2.

https://learn.microsoft.com/en-us/cpp/build/arm64-windows-abi-conventions?view=msvc-170

Regards,
Evgeny

-Original Message-
Thursday, February 22, 2024 2:11 PM
Richard Earnshaw (lists) wrote:

On 21/02/2024 18:30, Evgeny Karpov wrote:
>
+   tm_defines="${tm_defines} TARGET_ARM64_MS_ABI=1"

I missed this on first reading...

The GCC port name uses AARCH64, please use that internally rather than other 
names.  The only time when we should be using ARM64 is when it's needed for 
compatibility with other compilers and that doesn't apply here AFAICT.

R.


Re: [PATCH] c: Handle scoped attributes in __has*attribute and scoped attribute parsing changes in -std=c11 etc. modes [PR114007]

2024-02-22 Thread Michael Matz
Hello,

On Thu, 22 Feb 2024, Jakub Jelinek wrote:

> So, the following patch adds a flag during preprocessing at the point
> where we normally create CPP_SCOPE tokens out of 2 consecutive colons
> on the first CPP_COLON to mark the consecutive case (as we are tight
> on the bits, I've reused the PURE_ZERO flag, which is used just by the
> C++ FE and only ever set (both C and C++) on CPP_NUMBER tokens, this
> new flag has the same value and is only ever used on CPP_COLON tokens)

Hmm, shouldn't you be able to use (nonexistence of) the PREV_WHITE flag on 
the second COLON token to see that it's indeed a '::' without intervening 
whitespace?  Instead of setting a new flag on the first COLON token?

I.e. something like this:

   if (c_parser_next_token_is (parser, CPP_SCOPE)
-  || (loose_scope_p
- && c_parser_next_token_is (parser, CPP_COLON)
  && c_parser_peek_2nd_token (parser)->type == CPP_COLON))
+ && !(c_parser_peek_2nd_token (parser)->flags & PREV_WHITE)))

?


Ciao,
Michael.


Re: [PATCH] libcpp: Improve location for macro names [PR66290]

2024-02-22 Thread Lewis Hyatt
On Thu, Feb 22, 2024 at 3:56 AM Richard Biener
 wrote:
>
> On Tue, Feb 20, 2024 at 3:33 PM Lewis Hyatt  wrote:
> >
> > On Mon, Feb 19, 2024 at 11:36 PM Alexandre Oliva  wrote:
> > >
> > > This backport for gcc-13 is the first of two required for the
> > > g++.dg/pch/line-map-3.C test to stop hitting a variant of the known
> > > problem mentioned in that testcase: on riscv64-elf and riscv32-elf,
> > > after restoring the PCH, the location of the macros is mentioned as if
> > > they were on line 3 rather than 2, so even the existing xfails fail.  I
> > > think this might be too much to backport, and I'm ready to use an xfail
> > > instead, but since this would bring more predictability, I thought I'd
> > > ask whether you'd find this backport acceptable.
> > >
> > > Regstrapped on x86_64-linux-gnu, along with other backports, and tested
> > > manually on riscv64-elf.  Ok to install?
> >
> > Sorry that test is causing a problem, I hadn't realized at first that
> > the wrong output was target-dependent. I feel like simply deleting
> > this test g++.dg/pch/line-map-3.C from GCC 13 branch should be fine
> > too, as a safer alternative, if release managers prefer?
>
> Yes please.
>
> Richard.

Committed that removal as r13-8353.

-Lewis


[PATCH] tree-optimization/114048 - ICE in copy_reference_ops_from_ref

2024-02-22 Thread Richard Biener
The following adds another exception to the assert verifying we're
not running into a spurious off == -1.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR tree-optimization/114048
* tree-ssa-sccvn.cc (copy_reference_ops_from_ref): MEM_REF
can also produce -1 off.

* gcc.dg/torture/pr114048.c: New testcase.
---
 gcc/testsuite/gcc.dg/torture/pr114048.c | 25 +
 gcc/tree-ssa-sccvn.cc   |  2 ++
 2 files changed, 27 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr114048.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr114048.c 
b/gcc/testsuite/gcc.dg/torture/pr114048.c
new file mode 100644
index 000..338000b3006
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr114048.c
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+
+typedef struct {
+  void *child[2];
+  char otherbits;
+} critbit0_node;
+
+int allprefixed_traverse(char *top)
+{
+  if (top)
+{
+  critbit0_node *q = (void *)top - 1;
+  int direction = 0;
+  for (;; ++direction)
+   switch (allprefixed_traverse(q->child[direction]))
+ {
+ case 1:
+   break;
+ case 0:
+   return 0;
+ default:
+   return 1;
+ }
+}
+}
diff --git a/gcc/tree-ssa-sccvn.cc b/gcc/tree-ssa-sccvn.cc
index 3e93685e80a..2587eb1c505 100644
--- a/gcc/tree-ssa-sccvn.cc
+++ b/gcc/tree-ssa-sccvn.cc
@@ -1185,6 +1185,8 @@ copy_reference_ops_from_ref (tree ref, vec<vn_reference_op_s> *result)
  else
{
  gcc_assert (known_ne (op.off, -1)
+ /* The constant offset can be -1.  */
+ || op.opcode == MEM_REF
  /* Out-of-bound indices can compute to
 a known -1 offset.  */
  || ((op.opcode == ARRAY_REF
-- 
2.35.3


PING: [PATCH] x86-64: Check R_X86_64_CODE_6_GOTTPOFF support

2024-02-22 Thread H.J. Lu
On Sun, Feb 18, 2024 at 8:02 AM H.J. Lu  wrote:
>
> If the assembler and linker support
>
> add %reg1, name@gottpoff(%rip), %reg2
>
> with R_X86_64_CODE_6_GOTTPOFF, we can generate it instead of
>
> mov name@gottpoff(%rip), %reg2
> add %reg1, %reg2
>
> gcc/
>
> * configure.ac (HAVE_AS_R_X86_64_CODE_6_GOTTPOFF): Defined as 1
> if R_X86_64_CODE_6_GOTTPOFF is supported.
> * config.in: Regenerated.
> * configure: Likewise.
> * config/i386/predicates.md (apx_ndd_add_memory_operand): Allow
> UNSPEC_GOTNTPOFF if R_X86_64_CODE_6_GOTTPOFF is supported.
>
> gcc/testsuite/
>
> * gcc.target/i386/apx-ndd-tls-1b.c: New test.
> * lib/target-supports.exp
> (check_effective_target_code_6_gottpoff_reloc): New.
> ---
>  gcc/config.in |  7 +++
>  gcc/config/i386/predicates.md |  6 +-
>  gcc/configure | 62 +++
>  gcc/configure.ac  | 37 +++
>  .../gcc.target/i386/apx-ndd-tls-1b.c  |  9 +++
>  gcc/testsuite/lib/target-supports.exp | 48 ++
>  6 files changed, 168 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/apx-ndd-tls-1b.c
>
> diff --git a/gcc/config.in b/gcc/config.in
> index ce1d073833f..f3de4ba6776 100644
> --- a/gcc/config.in
> +++ b/gcc/config.in
> @@ -737,6 +737,13 @@
>  #endif
>
>
> +/* Define 0/1 if your assembler and linker support R_X86_64_CODE_6_GOTTPOFF.
> +   */
> +#ifndef USED_FOR_TARGET
> +#undef HAVE_AS_R_X86_64_CODE_6_GOTTPOFF
> +#endif
> +
> +
>  /* Define if your assembler supports relocs needed by -fpic. */
>  #ifndef USED_FOR_TARGET
>  #undef HAVE_AS_SMALL_PIC_RELOCS
> diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
> index 4c1aedd7e70..391f108c360 100644
> --- a/gcc/config/i386/predicates.md
> +++ b/gcc/config/i386/predicates.md
> @@ -2299,10 +2299,14 @@ (define_predicate "apx_ndd_memory_operand"
>
>  ;; Return true if OP is a memory operand which can be used in APX NDD
>  ;; ADD with register source operand.  UNSPEC_GOTNTPOFF memory operand
> -;; isn't allowed with APX NDD ADD.
> +;; is allowed with APX NDD ADD only if R_X86_64_CODE_6_GOTTPOFF works.
>  (define_predicate "apx_ndd_add_memory_operand"
>(match_operand 0 "memory_operand")
>  {
> +  /* OK if "add %reg1, name@gottpoff(%rip), %reg2" is supported.  */
> +  if (HAVE_AS_R_X86_64_CODE_6_GOTTPOFF)
> +return true;
> +
>op = XEXP (op, 0);
>
>/* Disallow APX NDD ADD with UNSPEC_GOTNTPOFF.  */
> diff --git a/gcc/configure b/gcc/configure
> index 41b978b0380..c59c971862c 100755
> --- a/gcc/configure
> +++ b/gcc/configure
> @@ -29834,6 +29834,68 @@ cat >>confdefs.h <<_ACEOF
>  _ACEOF
>
>
> +if echo "$ld_ver" | grep GNU > /dev/null; then
> +  if $gcc_cv_ld -V 2>/dev/null | grep elf_x86_64_sol2 > /dev/null; then
> +ld_ix86_gld_64_opt="-melf_x86_64_sol2"
> +  else
> +ld_ix86_gld_64_opt="-melf_x86_64"
> +  fi
> +fi
> +conftest_s='
> +   .text
> +   .globl  _start
> +   .type _start, @function
> +_start:
> +   addq%r23,foo@GOTTPOFF(%rip), %r15
> +   .section .tdata,"awT",@progbits
> +   .type foo, @object
> +foo:
> +   .quad 0'
> +{ $as_echo "$as_me:${as_lineno-$LINENO}: checking assembler for 
> R_X86_64_CODE_6_GOTTPOFF reloc" >&5
> +$as_echo_n "checking assembler for R_X86_64_CODE_6_GOTTPOFF reloc... " >&6; }
> +if ${gcc_cv_as_x86_64_code_6_gottpoff+:} false; then :
> +  $as_echo_n "(cached) " >&6
> +else
> +  gcc_cv_as_x86_64_code_6_gottpoff=no
> +  if test x$gcc_cv_as != x; then
> +$as_echo "$conftest_s" > conftest.s
> +if { ac_try='$gcc_cv_as $gcc_cv_as_flags  -o conftest.o conftest.s >&5'
> +  { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_try\""; } >&5
> +  (eval $ac_try) 2>&5
> +  ac_status=$?
> +  $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
> +  test $ac_status = 0; }; }
> +then
> +   if test x$gcc_cv_ld != x && test x$gcc_cv_objdump != x \
> +   && test x$gcc_cv_readelf != x \
> +   && $gcc_cv_readelf --relocs --wide conftest.o 2>&1 \
> +  | grep R_X86_64_CODE_6_GOTTPOFF > /dev/null 2>&1 \
> +   && $gcc_cv_ld $ld_ix86_gld_64_opt -o conftest conftest.o > 
> /dev/null 2>&1; then
> +  if $gcc_cv_objdump -dw conftest 2>&1 \
> + | grep "add \+\$0xf\+8,%r23,%r15" > /dev/null 2>&1; then
> +gcc_cv_as_x86_64_code_6_gottpoff=yes
> +  else
> +gcc_cv_as_x86_64_code_6_gottpoff=no
> +  fi
> +fi
> +rm -f conftest
> +else
> +  echo "configure: failed program was" >&5
> +  cat conftest.s >&5
> +fi
> +rm -f conftest.o conftest.s
> +  fi
> +fi
> +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: 
> $gcc_cv_as_x86_64_code_6_gottpoff" >&5
> +$as_echo "$gcc_cv_as_x86_64_code_6_gottpoff" >&6; }
> +
> +
> +
> +cat >>confdefs.h <<_

Re: [PATCH v9 1/2] Add condition coverage (MC/DC)

2024-02-22 Thread Jørgen Kvalsvik

On 22/02/2024 14:26, Jan Hubicka wrote:

Hello,

This patch adds support in gcc+gcov for modified condition/decision
coverage (MC/DC) with the -fcondition-coverage flag. MC/DC is a type of
test/code coverage and it is particularly important for safety-critical
applications in industries like aviation and automotive. Notably, MC/DC
is required or recommended by:

 * DO-178C for the most critical software (Level A) in avionics.
 * IEC 61508 for SIL 4.
 * ISO 26262-6 for ASIL D.

 From the SQLite webpage:

 Two methods of measuring test coverage were described above:
 "statement" and "branch" coverage. There are many other test
 coverage metrics besides these two. Another popular metric is
 "Modified Condition/Decision Coverage" or MC/DC. Wikipedia defines
 MC/DC as follows:

 * Each decision tries every possible outcome.
 * Each condition in a decision takes on every possible outcome.
 * Each entry and exit point is invoked.
 * Each condition in a decision is shown to independently affect
   the outcome of the decision.

 In the C programming language where && and || are "short-circuit"
 operators, MC/DC and branch coverage are very nearly the same thing.
 The primary difference is in boolean vector tests. One can test for
 any of several bits in bit-vector and still obtain 100% branch test
 coverage even though the second element of MC/DC - the requirement
 that each condition in a decision take on every possible outcome -
 might not be satisfied.

 https://sqlite.org/testing.html#mcdc

MC/DC comes in different flavours, the most important being unique
cause MC/DC and masking MC/DC - this patch implements masking MC/DC,
which works well with short circuiting semantics, and according to
John Chilenski's "An Investigation of Three Forms of the Modified
Condition Decision Coverage (MCDC) Criterion" (2001) is as good as
unique cause at catching bugs.

Whalen, Heimdahl, and De Silva "Efficient Test Coverage Measurement for
MC/DC" describes an algorithm for determining masking from an AST walk,
but my algorithm figures this out from analyzing the control flow graph.
The CFG is considered a binary decision diagram and an evaluation
becomes a path through the BDD, which is recorded. Certain paths will
mask ("null out") the contribution from earlier path segments, which can
be determined by finding short circuit endpoints. Masking is the
short circuiting of terms in the reverse-ordered Boolean function, and
the masked terms do not affect the decision, just as short-circuited
terms do not affect the decision.

A tag/discriminator mapping from gcond->uid is created during
gimplification and made available through the function struct. The
values are unimportant as long as basic conditions constructed from a
single Boolean expression are given the same identifier. This happens in
the breaking down of ANDIF/ORIF trees, so the coverage generally works
well for frontends that create such trees.

Like Whalen et al., this implementation records coverage in fixed-size
bitsets which gcov knows how to interpret. This takes only a few bitwise
operations per condition and is very fast, but comes with a limit on the
number of terms in a single boolean expression: the number of bits in a
gcov_unsigned_type (which is usually typedef'd to uint64_t). For most
practical purposes this is acceptable, and by default a warning will be
issued if gcc cannot instrument the expression.  This is a practical
limitation in the implementation, not the algorithm, so support for more
conditions can be added by also introducing arbitrary-sized bitsets.
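
As a rough illustration of the fixed-size bitset idea, here is a minimal
sketch. The struct layout and function names are hypothetical and are not
gcov's actual counter format, and the sketch deliberately ignores masking: it
only shows why recording costs a few bitwise operations and why the word size
caps the number of conditions per expression.

```c
#include <stdint.h>

/* Hypothetical per-decision record; not gcov's actual counter layout.  */
struct cond_coverage
{
  uint64_t seen_true;   /* Bit I set: condition I was observed true.  */
  uint64_t seen_false;  /* Bit I set: condition I was observed false.  */
};

enum { MAX_CONDITIONS = 64 };  /* One bit per condition in a uint64_t.  */

/* Record that condition IDX evaluated to OUTCOME.  Returns 0 on success
   and -1 if IDX does not fit in the bitset, which corresponds to the
   case where the expression cannot be instrumented.  */
static int
record_outcome (struct cond_coverage *cov, unsigned idx, int outcome)
{
  if (idx >= MAX_CONDITIONS)
    return -1;

  uint64_t bit = UINT64_C (1) << idx;
  if (outcome)
    cov->seen_true |= bit;
  else
    cov->seen_false |= bit;
  return 0;
}

/* Portable population count (clears the lowest set bit each round).  */
static unsigned
popcount64 (uint64_t x)
{
  unsigned n = 0;
  while (x)
    {
      x &= x - 1;
      n++;
    }
  return n;
}

/* Number of (condition, outcome) pairs observed so far.  For an
   expression with N conditions the maximum is 2 * N.  */
static unsigned
covered_outcomes (const struct cond_coverage *cov)
{
  return popcount64 (cov->seen_true) + popcount64 (cov->seen_false);
}
```

A masking-aware recorder would additionally clear the bits of masked
conditions from the per-evaluation state before merging it into the
accumulated sets; that step is left out here.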

In action it looks pretty similar to the branch coverage. The -g short
opt carries no significance, but was chosen because it was an available
option with the upper-case free too.

gcov --conditions:

 3:   17:void fn (int a, int b, int c, int d) {
 3:   18:if ((a && (b || c)) && d)
conditions covered 3/8
condition  0 not covered (true false)
condition  1 not covered (true)
condition  2 not covered (true)
condition  3 not covered (true)
 1:   19:x = 1;
 -:   20:else
 2:   21:x = 2;
 3:   22:}

gcov --conditions --json-format:

"conditions": [
 {
 "not_covered_false": [
 0
 ],
 "count": 8,
 "covered": 3,
 "not_covered_true": [
 0,
 1,
 2,
 3
 ]
 }
],

Expressions with constants may be heavily rewritten before they reach
gimplification, so constructs like int x = a ? 0 : 1 become
_x = (_a == 0). From the source you would expect coverage, but they get
neither branch nor condition coverage. The same applies to expressions
like int x = 1 || a which are simply replaced by a constant.
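
A small sketch of the equivalences mentioned above (the function names are
made up for illustration): the three forms below compute what the text
describes, and only the first one looks like a decision in the source.

```c
/* As written in the source: looks like a two-way decision on A.  */
static int
select_form (int a)
{
  return a ? 0 : 1;
}

/* The rewritten shape described in the text: a plain comparison with no
   branch left to instrument.  */
static int
compare_form (int a)
{
  return a == 0;
}

/* 1 || a never evaluates A and always yields 1, so it can be replaced
   by a constant.  */
static int
constant_form (int a)
{
  return 1 || a;
}
```

select_form and compare_form agree for every input, which is why the
conditional can disappear before coverage instrumentation ever sees it.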

The test suite contains a lot of small programs and functions. Some of
these were designed by hand to test for specific behaviours and graph

Re: [PATCH v1 00/13] Add aarch64-w64-mingw32 target

2024-02-22 Thread Richard Earnshaw (lists)
On 21/02/2024 17:47, Evgeny Karpov wrote:
> Hello,
> 
> We would like to draw your attention to the review of changes for the
> new GCC target, aarch64-w64-mingw32. The new target will be
> supported, tested, added to CI, and maintained by Linaro. This marks
> the first of three planned patch series contributing to the GCC C
> compiler's support for Windows Arm64.
> 
> 1. Minimal aarch64-w64-mingw32 C implementation to cross-compile
> hello-world with libgcc for Windows Arm64 using MinGW.
> 2. Extension of the aarch64-w64-mingw32 C implementation to
> cross-compile OpenSSL, OpenBLAS, FFmpeg, and libjpeg-turbo. All
> packages successfully pass tests.
> 3. Addition of call stack support for debugging, resolution of
> optimization issues in the C compiler, and DLL export/import for the
> aarch64-w64-mingw32 target.
> 
> This patch series introduces the 1st point, which involves building
> hello-world for the aarch64-w64-mingw32 target. The patch depends on
> the binutils changes for the aarch64-w64-mingw32 target that have
> already been merged.
> 
> The binutils should include recent relocation fixes.
> f87eaf8ff3995a5888c6dc4996a20c770e6bcd36
> aarch64: Add new relocations and limit COFF AArch64 relocation offsets
> 
> The series is structured in a way to trivially show that it should not
> affect any other targets.
> 
> In this patch, several changes have been made to support the
> aarch64-w64-mingw32 target for GCC. The modifications include the
> definition of the MS ABI for aarch64, adjustments to FIXED_REGISTERS
> and STATIC_CHAIN_REGNUM for different ABIs, and specific definitions
> for COFF format on AArch64. Additionally, the patch reuses MinGW
>  types and definitions from i386, relocating them to a new
> mingw folder for shared usage between both targets.
> 
> MinGW-specific options have been introduced for AArch64, along with
> override options for aarch64-w64-mingw32. Builtin stack probing for
> AArch64 has been enabled as an alternative for chkstk. Symbol name
> encoding and section information handling for aarch64-w64-mingw32 have
> been incorporated, and the MinGW environment has been added, which
> will also be utilized for defining the Cygwin environment in the
> future.
> 
> The patch includes renaming "x86 Windows Options" to "Cygwin and MinGW
> Options," which now encompasses AArch64 as well. AArch64-specific
> Cygwin and MinGW Options have been introduced for the unique
> requirements of the AArch64 architecture.
> 
> Function type declaration and named sections support have been added.
> The necessary objects for Cygwin and MinGW have been built for the
> aarch64-w64-mingw32 target, and relevant files such as msformat-c.cc
> and winnt-d.cc have been moved to the mingw folder for reuse in
> AArch64.
> 
> Furthermore, the aarch64-w64-mingw32 target has been included in both
> libatomic and libgcc, ensuring support for the AArch64 architecture
> within these libraries. These changes collectively enhance the
> capabilities of GCC for the specified target.
> 
> Coauthors: Zac Walker,
> Mark Harmstone and
> Ron Riddle
> 
> Refactored, prepared, and validated by
> Radek Barton and
> Evgeny Karpov
> 
> Special thanks to the Linaro GNU toolchain team for internal review
> and assistance in preparing the patch series!
> 
> Regards,
> Evgeny

Thanks for posting this.

I've only read quickly through this patch series and responded where I think 
some action is obviously required.  That doesn't necessarily mean the other 
patches are perfect, though, just that nothing immediately caught my attention.

R.

> 
> 
> Zac Walker (13):
>   Introduce aarch64-w64-mingw32 target
>   aarch64: The aarch64-w64-mingw32 target implements the MS ABI
>   aarch64: Mark x18 register as a fixed register for MS ABI
>   aarch64: Add aarch64-w64-mingw32 COFF
>   Reuse MinGW from i386 for AArch64
>   Rename section and encoding functions from i386 which will be used in
> aarch64
>   Exclude i386 functionality from aarch64 build
>   aarch64: Add Cygwin and MinGW environments for AArch64
>   aarch64: Add SEH to machine_function
>   Rename "x86 Windows Options" to "Cygwin and MinGW Options"
>   aarch64: Build and add objects for Cygwin and MinGW for AArch64
>   aarch64: Add aarch64-w64-mingw32 target to libatomic
>   Add aarch64-w64-mingw32 target to libgcc
> 
>  fixincludes/mkfixinc.sh   |   3 +-
>  gcc/config.gcc|  47 +++--
>  gcc/config/aarch64/aarch64-coff.h |  92 +
>  gcc/config/aarch64/aarch64-opts.h |   7 +
>  gcc/config/aarch64/aarch64-protos.h   |   5 +
>  gcc/config/aarch64/aarch64.h  |  25 ++-
>  gcc/config/aarch64/cygming.h  | 178 ++
>  gcc/config/i386/cygming.h |  18 +-
>  gcc/config/i386/cygming.opt.urls  |  30 ---
>  gcc/config/i386/i386-protos.h  

Re: [PATCH v1 13/13] Add aarch64-w64-mingw32 target to libgcc

2024-02-22 Thread Richard Earnshaw (lists)
On 21/02/2024 18:40, Evgeny Karpov wrote:
> 
+aarch64-*-mingw*)

This doesn't match the glob pattern you added to config.gcc in an earlier 
patch, but see my comment on that.  The two should really be consistent with 
each other or you might get build failures late on.
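
One way to sanity-check that two such globs agree is sketched below, using
POSIX fnmatch as an approximation of shell case matching. The helper name is
made up, and any second pattern passed to it is the caller's assumption, since
the config.gcc pattern is not shown here.

```c
#include <fnmatch.h>
#include <stdbool.h>

/* True if patterns A and B give the same answer (both match or both
   fail to match) for the target TRIPLET.  fnmatch is used here as an
   approximation of how a shell case statement matches globs.  */
static bool
patterns_agree (const char *a, const char *b, const char *triplet)
{
  bool a_matches = fnmatch (a, triplet, 0) == 0;
  bool b_matches = fnmatch (b, triplet, 0) == 0;
  return a_matches == b_matches;
}
```

For example, "aarch64-*-mingw*" and a hypothetical "aarch64*-*-mingw*" agree
on aarch64-w64-mingw32 but disagree on a triplet such as
aarch64_be-w64-mingw32, which is the kind of mismatch that only shows up late
in a build.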

R.


Re: [PATCH v1 10/13] Rename "x86 Windows Options" to "Cygwin and MinGW Options"

2024-02-22 Thread Richard Earnshaw (lists)
On 21/02/2024 18:38, Evgeny Karpov wrote:
> 
For this change you might want to put some form of re-direct in the manual 
under the old name so that anybody used to looking for the old entry will know 
where things have been moved to.  Something like

x86 Windows Options
  See xref(Cygwin and MinGW Options).

R.


Re: [PATCH v9 2/2] Add gcov MC/DC tests for GDC

2024-02-22 Thread Jan Hubicka
> This is a mostly straight port from the gcov-19.c tests from the C test
> suite. The only notable differences from C to D are that D flips the
> true/false outcomes for loop headers, and the D front end ties loop and
> ternary conditions to slightly different locus.
> 
> The test for >64 conditions warning is disabled as it either needs
> support from the testing framework or something similar to #pragma GCC
> diagnostic push to not cause a test failure from detecting a warning.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gdc.dg/gcov.exp: New test.
>   * gdc.dg/gcov1.d: New test.

I never wrote anything in D, so I would prefer Iain to take a look, but
the transition seems direct enough so I think the patch is OK.

Honza


Re: [PATCH v9 1/2] Add condition coverage (MC/DC)

2024-02-22 Thread Jan Hubicka
Hello,
> This patch adds support in gcc+gcov for modified condition/decision
> coverage (MC/DC) with the -fcondition-coverage flag. MC/DC is a type of
> test/code coverage and it is particularly important for safety-critical
> applications in industries like aviation and automotive. Notably, MC/DC
> is required or recommended by:
> 
> * DO-178C for the most critical software (Level A) in avionics.
> * IEC 61508 for SIL 4.
> * ISO 26262-6 for ASIL D.
> 
> From the SQLite webpage:
> 
> Two methods of measuring test coverage were described above:
> "statement" and "branch" coverage. There are many other test
> coverage metrics besides these two. Another popular metric is
> "Modified Condition/Decision Coverage" or MC/DC. Wikipedia defines
> MC/DC as follows:
> 
> * Each decision tries every possible outcome.
> * Each condition in a decision takes on every possible outcome.
> * Each entry and exit point is invoked.
> * Each condition in a decision is shown to independently affect
>   the outcome of the decision.
> 
> In the C programming language where && and || are "short-circuit"
> operators, MC/DC and branch coverage are very nearly the same thing.
> The primary difference is in boolean vector tests. One can test for
> any of several bits in bit-vector and still obtain 100% branch test
> coverage even though the second element of MC/DC - the requirement
> that each condition in a decision take on every possible outcome -
> might not be satisfied.
> 
> https://sqlite.org/testing.html#mcdc
> 
> MC/DC comes in different flavours, the most important being unique
> cause MC/DC and masking MC/DC - this patch implements masking MC/DC,
> which works well with short circuiting semantics, and according to
> John Chilenski's "An Investigation of Three Forms of the Modified
> Condition Decision Coverage (MCDC) Criterion" (2001) is as good as
> unique cause at catching bugs.
> 
> Whalen, Heimdahl, and De Silva "Efficient Test Coverage Measurement for
> MC/DC" describes an algorithm for determining masking from an AST walk,
> but my algorithm figures this out from analyzing the control flow graph.
> The CFG is considered a binary decision diagram and an evaluation
> becomes a path through the BDD, which is recorded. Certain paths will
> mask ("null out") the contribution from earlier path segments, which can
> be determined by finding short circuit endpoints. Masking is the
> short circuiting of terms in the reverse-ordered Boolean function, and
> the masked terms do not affect the decision like short-circuited
> terms do not affect the decision.
> 
> A tag/discriminator mapping from gcond->uid is created during
> gimplification and made available through the function struct. The
> values are unimportant as long as basic conditions constructed from a
> single Boolean expression are given the same identifier. This happens in
> the breaking down of ANDIF/ORIF trees, so the coverage generally works
> well for frontends that create such trees.
> 
> Like Whalen et al this implementation records coverage in fixed-size
> bitsets which gcov knows how to interpret. This takes only a few bitwise
> operations per condition and is very fast, but comes with a limit on the
> number of terms in a single boolean expression; the number of bits in a
> gcov_unsigned_type (which is usually typedef'd to uint64_t). For most
> practical purposes this is acceptable, and by default a warning will be
> issued if gcc cannot instrument the expression.  This is a practical
> limitation in the implementation, not the algorithm, so support for more
> conditions can be added by also introducing arbitrary-sized bitsets.
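
A minimal sketch of the fixed-size bitset bookkeeping described above, in plain C. It only records which outcomes each basic condition has taken and deliberately leaves out the masking step, so it illustrates the storage scheme and the 64-term limit rather than the patch's actual instrumentation. All names (`cond_coverage`, `record`, `covered_outcomes`, `eval_instrumented`) are hypothetical and do not come from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* One bit per basic condition; this is where a 64-term limit comes from
   when the bitset is a single 64-bit integer.  */
#define MAX_CONDITIONS 64

struct cond_coverage
{
  uint64_t seen_true;   /* bit i set: condition i evaluated to true */
  uint64_t seen_false;  /* bit i set: condition i evaluated to false */
};

/* Record the outcome of condition IDX and pass the value through.  */
static int
record (struct cond_coverage *cov, unsigned idx, int value)
{
  assert (idx < MAX_CONDITIONS);
  if (value)
    cov->seen_true |= UINT64_C (1) << idx;
  else
    cov->seen_false |= UINT64_C (1) << idx;
  return value != 0;
}

/* Count set bits without relying on compiler builtins.  */
static unsigned
count_bits (uint64_t x)
{
  unsigned n = 0;
  for (; x; x &= x - 1)
    n++;
  return n;
}

/* Number of (condition, outcome) pairs observed so far.  */
static unsigned
covered_outcomes (const struct cond_coverage *cov)
{
  return count_bits (cov->seen_true) + count_bits (cov->seen_false);
}

/* (a && (b || c)) && d with every basic condition instrumented.
   Short-circuiting means skipped conditions record nothing.  */
static int
eval_instrumented (struct cond_coverage *cov, int a, int b, int c, int d)
{
  return (record (cov, 0, a)
          && (record (cov, 1, b) || record (cov, 2, c)))
         && record (cov, 3, d);
}
```

Starting from a zeroed `struct cond_coverage`, a call with all-zero inputs records a single outcome (condition 0 false), because the remaining conditions are short-circuited away. A real MC/DC implementation would additionally discard outcomes that are masked by later terms.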
> 
> In action it looks pretty similar to the branch coverage. The -g short
> opt carries no significance, but was chosen because it was an available
> option with the upper-case free too.
> 
> gcov --conditions:
> 
> 3:   17:void fn (int a, int b, int c, int d) {
> 3:   18:if ((a && (b || c)) && d)
> conditions covered 3/8
> condition  0 not covered (true false)
> condition  1 not covered (true)
> condition  2 not covered (true)
> condition  3 not covered (true)
> 1:   19:x = 1;
> -:   20:else
> 2:   21:x = 2;
> 3:   22:}
> 
> gcov --conditions --json-format:
> 
> "conditions": [
> {
> "not_covered_false": [
> 0
> ],
> "count": 8,
> "covered": 3,
> "not_covered_true": [
> 0,
> 1,
> 2,
> 3
> ]
> }
> ],
> 
> Expressions with constants may be heavily rewritten before they reach
> gimplification, so constructs like int x = a ? 0 : 1 become
> _x = (_a == 0). From source you would expect coverage, but it gets
> neither branch nor condition coverage. The same applies to expressions
> like int x = 1 || a which are simply replaced by a constant.
> 
> 

Re: [PATCH v1 08/13] aarch64: Add Cygwin and MinGW environments for AArch64

2024-02-22 Thread Richard Earnshaw (lists)
On 21/02/2024 18:36, Evgeny Karpov wrote:
> 
+/* GNU as supports weak symbols on PECOFF.  */
+#ifdef HAVE_GAS_WEAK

Can't we assume this is true?  It was most likely needed on i386 because 
support goes back longer than the assembler had this feature, but it looks like 
it was added in 2000, or thereabouts, so significantly before aarch64 was 
supported in the assembler.

+#ifndef HAVE_GAS_ALIGNED_COMM

And this was added to GCC in 2009, which probably means it predates 
aarch64-coff support in gas as well.

R.


Re: [PATCH v1 03/13] aarch64: Mark x18 register as a fixed register for MS ABI

2024-02-22 Thread Richard Earnshaw (lists)
On 21/02/2024 18:30, Evgeny Karpov wrote:
> 
+   tm_defines="${tm_defines} TARGET_ARM64_MS_ABI=1"

I missed this on first reading...

The GCC port name uses AARCH64, please use that internally rather than other 
names.  The only time when we should be using ARM64 is when it's needed for 
compatibility with other compilers and that doesn't apply here AFAICT.

R.


Re: [PATCH] profile-count: Don't dump through a temporary buffer [PR111960]

2024-02-22 Thread Jan Hubicka
> Hi!
> 
> The profile_count::dump (char *, struct function * = NULL) const;
> method has a single caller, the
> profile_count::dump (FILE *f, struct function *fun) const;
> method and for that going through a temporary buffer is just slower
> and opens doors for buffer overflows, which is exactly why this P1
> was filed.
> The buffer size is 64 bytes, the previous maximum
> "%" PRId64 " (%s)"
> would print up to 61 bytes in there (19 bytes for arbitrary uint64_t:61
> bitfield printed as signed, "estimated locally, globally 0 adjusted"
> i.e. 38 bytes longest %s and 4 other characters).
> Now, after the r14-2389 changes, it can be
> 19 + 38 plus 11 other characters + %.4f, which is worst case
> 309 chars before decimal point, decimal point and 4 digits after it,
> so total 382 bytes.
> 
> So, either we could bump the buffer[64] to buffer[400], or the following
> patch just drops the indirection through buffer and prints it directly to
> stream.  After all, having APIs which fill in some buffer without passing
> down the size of the buffer is just asking for buffer overflows over time.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Thanks for fixing it!
I believe the original reason why dump had the buffer argument was
pretty printing, which we do differently now.  Fully agree that the
fixed size buffer is ugly.

Honza


[PATCH] tree-optimization/114027 - conditional reduction chain

2024-02-22 Thread Richard Biener
When we classify a conditional reduction chain as CONST_COND_REDUCTION
we fail to verify all involved conditionals have the same constant.
That's a quite unlikely situation so the following simply disables
such classification when there's more than one reduction statement.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

Richard.

PR tree-optimization/114027
* tree-vect-loop.cc (vectorizable_reduction): Use optimized
condition reduction classification only for single-element
chains.

* gcc.dg/vect/pr114027.c: New testcase.
---
 gcc/testsuite/gcc.dg/vect/pr114027.c | 26 ++
 gcc/tree-vect-loop.cc| 11 ++-
 2 files changed, 32 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/pr114027.c

diff --git a/gcc/testsuite/gcc.dg/vect/pr114027.c 
b/gcc/testsuite/gcc.dg/vect/pr114027.c
new file mode 100644
index 000..ead9cdd982d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr114027.c
@@ -0,0 +1,26 @@
+#include "tree-vect.h"
+
+int __attribute__((noipa))
+foo (int *f, int n)
+{
+  int res = 0;
+  for (int i = 0; i < n; ++i)
+{
+  if (f[2*i])
+res = 2;
+  if (f[2*i+1])
+res = -2;
+}
+  return res;
+}
+
+int f[] = { 1, 1, 1, 1, 1, 1, 1, 1,
+1, 1, 1, 1, 1, 1, 1, 0 };
+
+int
+main ()
+{
+  if (foo (f, 16) != 2)
+__builtin_abort ();
+  return 0;
+}
diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index 5a5865c42fc..35f1f8c7d42 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -7759,17 +7759,18 @@ vectorizable_reduction (loop_vec_info loop_vinfo,
  < GET_MODE_SIZE (SCALAR_TYPE_MODE (TREE_TYPE 
(vectype_op[i]))
vectype_in = vectype_op[i];
 
-  if (op.code == COND_EXPR)
+  /* Record how the non-reduction-def value of COND_EXPR is defined.
+???  For a chain of multiple CONDs we'd have to match them up all.  */
+  if (op.code == COND_EXPR && reduc_chain_length == 1)
{
- /* Record how the non-reduction-def value of COND_EXPR is defined.  */
  if (dt == vect_constant_def)
{
  cond_reduc_dt = dt;
  cond_reduc_val = op.ops[i];
}
- if (dt == vect_induction_def
- && def_stmt_info
- && is_nonwrapping_integer_induction (def_stmt_info, loop))
+ else if (dt == vect_induction_def
+  && def_stmt_info
+  && is_nonwrapping_integer_induction (def_stmt_info, loop))
{
  cond_reduc_dt = dt;
  cond_stmt_vinfo = def_stmt_info;
-- 
2.35.3


Re: [PATCH v1 03/13] aarch64: Mark x18 register as a fixed register for MS ABI

2024-02-22 Thread Richard Earnshaw (lists)
On 21/02/2024 18:30, Evgeny Karpov wrote:
> 
+/* X18 reserved for the TEB on Windows.  */
+#ifdef TARGET_ARM64_MS_ABI
+# define FIXED_X18 1
+# define CALL_USED_X18 0
+#else
+# define FIXED_X18 0
+# define CALL_USED_X18 1
+#endif

I'm not overly keen on ifdefs like this (and the one below), it can get quite 
confusing if we have to support more than a couple of ABIs.  Perhaps we could 
create a couple of new headers, one for the EABI (which all existing targets 
would then need to include) and one for the MS ABI.  Then the mingw port would 
use that instead of the EABI header.

An alternative is to make all this dynamic, based on the setting of the 
aarch64_calling_abi enum and to make the adjustments in 
aarch64_conditional_register_usage.

+# define CALL_USED_X18 0

Is that really correct?  If the register is really reserved, but some code 
modifies it anyway, this will cause the compiler to restore the old value at 
the end of a function; generally, for a reserved register, code that knows what 
it's doing would want to make permanent changes to this value.

+#ifdef TARGET_ARM64_MS_ABI
+# define STATIC_CHAIN_REGNUM   R17_REGNUM
+#else
+# define STATIC_CHAIN_REGNUM   R18_REGNUM
+#endif

If we went the enum way, we'd want something like

#define STATIC_CHAIN_REGNUM (calling_abi == AARCH64_CALLING_ABI_MS ? R17_REGNUM 
: R18_REGNUM)

R.


Re: [PATCH v1 02/13] aarch64: The aarch64-w64-mingw32 target implements

2024-02-22 Thread Richard Earnshaw (lists)
On 21/02/2024 18:26, Evgeny Karpov wrote:
> 
+/* Available call ABIs.  */
+enum calling_abi
+{
+  AARCH64_EABI = 0,
+  MS_ABI = 1
+};
+

The convention in this file seems to be for all enum types to start with 
aarch64.  Also, the enumeration values should start with the name of the 
enumeration type in upper case, so:

enum aarch64_calling_abi
{
  AARCH64_CALLING_ABI_EABI,
  AARCH64_CALLING_ABI_MS
};

or something very much like that.

R.


Re: [PATCH v1 01/13] Introduce aarch64-w64-mingw32 target

2024-02-22 Thread Richard Earnshaw (lists)
On 21/02/2024 18:16, Evgeny Karpov wrote:
> 
+aarch64*-*-mingw*)

Other targets are a bit inconsistent here as well, but, as Andrew mentioned, if 
you don't want to handle big-endian, it might be better to match 
aarch64-*-mingw* here.
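
To make the difference between the two patterns concrete, here is a small sketch using POSIX `fnmatch()`, which with no flags treats `*` much like a shell `case` pattern does. The helper name `triplet_matches` is made up for illustration, and `aarch64_be-w64-mingw32` is a hypothetical big-endian triplet, not a target anyone has proposed.

```c
#include <assert.h>
#include <fnmatch.h>

/* Return 1 if TRIPLET matches the config-style glob PATTERN, else 0.
   fnmatch() with flags == 0 lets '*' match any run of characters.  */
static int
triplet_matches (const char *pattern, const char *triplet)
{
  assert (pattern && triplet);
  return fnmatch (pattern, triplet, 0) == 0;
}
```

With this helper, `aarch64-w64-mingw32` matches both `aarch64*-*-mingw*` and `aarch64-*-mingw*`, while the hypothetical `aarch64_be-w64-mingw32` matches only the wider `aarch64*-*-mingw*` pattern.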

R.


Re: [PATCH v1 05/13] Reuse MinGW from i386 for AArch64

2024-02-22 Thread Richard Earnshaw (lists)
On 21/02/2024 21:34, rep.dot@gmail.com wrote:
> On 21 February 2024 19:34:43 CET, Evgeny Karpov  
> wrote:
>>
> 
> Please use git send-email. Your mail ends up as empty as here, otherwise.

I don't see anything wrong with it; neither does patchwork 
(https://patchwork.sourceware.org/project/gcc/list/?series=31191) nor does the 
Linaro CI bot.  So perhaps it's your mailer that's misconfigured.

> 
> The ChangeLog has to be expressed in present tense, as mandated by the 
> standard; s/Moved/Move/g etc.

Agreed, but that's a detail that we can get to once the patch has been properly 
reviewed.

> 
> In any sane world ( and in gcc ) to fold, respectively a folder, is something 
> else compared to a directory ( which you probably mean when moving a file 
> from one directory to another directory as you seem to do ).
> 
> Most of the free world has left COFF behind since several decades, so I won't 
> comment on that. YMMV.

This isn't helpful.  Windows platforms use (a derivative of) COFF, so that's 
what the tools need to use when targeting that platform.

R.



Re: [GCC 13 PATCH] LoongArch: Don't default to -mno-explicit-relocs if -mno-relax

2024-02-22 Thread chenglulu



On 2024/2/22 at 6:20 PM, Xi Ruoyao wrote:

To improve Binutils compatibility we've had to backport relaxation
support.  But if a user just updates to GCC 13.3 and sticks with
Binutils 2.41, there is no reason to use -mno-explicit-relocs as the
default because we are turning off relaxation for Binutils 2.41 (it
lacks conditional branch relaxation support) anyway.

So like GCC 14, make the default of -m[no-]explicit-relocs depend on
-m[no-]relax instead of HAVE_AS_MRELAX_OPTION.  Also update the doc to
reflect the behavior change.

gcc/ChangeLog:

* config/loongarch/genopts/loongarch.opt.in
(TARGET_EXPLICIT_RELOCS): Init to M_OPTION_NOT_SEEN.
* config/loongarch/loongarch.opt: Regenerate.
* config/loongarch/loongarch.cc
(loongarch_option_override_internal): Set the default of
TARGET_EXPLICIT_RELOCS to HAVE_AS_EXPLICIT_RELOCS
&& !loongarch_mrelax.
* doc/invoke.texi (-m[no-]explicit-relocs): Update for
LoongArch.
---

Ok for releases/gcc-13?


LGTM!

Thanks!



  gcc/config/loongarch/genopts/loongarch.opt.in |  2 +-
  gcc/config/loongarch/loongarch.cc |  4 
  gcc/config/loongarch/loongarch.opt|  2 +-
  gcc/doc/invoke.texi   | 11 +--
  4 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/gcc/config/loongarch/genopts/loongarch.opt.in 
b/gcc/config/loongarch/genopts/loongarch.opt.in
index da6fedd153e..76acd35d39c 100644
--- a/gcc/config/loongarch/genopts/loongarch.opt.in
+++ b/gcc/config/loongarch/genopts/loongarch.opt.in
@@ -155,7 +155,7 @@ Target Joined RejectNegative UInteger 
Var(loongarch_max_inline_memcpy_size) Init
  -mmax-inline-memcpy-size=SIZE Set the max size of memcpy to inline, default 
is 1024.
  
  mexplicit-relocs

-Target Var(TARGET_EXPLICIT_RELOCS) Init(HAVE_AS_EXPLICIT_RELOCS & 
!HAVE_AS_MRELAX_OPTION)
+Target Var(TARGET_EXPLICIT_RELOCS) Init(M_OPTION_NOT_SEEN)
  Use %reloc() assembly operators.
  
  ; The code model option names for -mcmodel.

diff --git a/gcc/config/loongarch/loongarch.cc 
b/gcc/config/loongarch/loongarch.cc
index 768e2427285..e78b81cd8fc 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -6222,6 +6222,10 @@ loongarch_option_override_internal (struct gcc_options 
*opts)
gcc_unreachable ();
  }
  
+  if (TARGET_EXPLICIT_RELOCS == M_OPTION_NOT_SEEN)

+TARGET_EXPLICIT_RELOCS = (HAVE_AS_EXPLICIT_RELOCS
+ && !loongarch_mrelax);
+
/* Validate the guard size.  */
int guard_size = param_stack_clash_protection_guard_size;
  
diff --git a/gcc/config/loongarch/loongarch.opt b/gcc/config/loongarch/loongarch.opt

index 59b1e06d3f2..e61fbaed2c1 100644
--- a/gcc/config/loongarch/loongarch.opt
+++ b/gcc/config/loongarch/loongarch.opt
@@ -162,7 +162,7 @@ Target Joined RejectNegative UInteger 
Var(loongarch_max_inline_memcpy_size) Init
  -mmax-inline-memcpy-size=SIZE Set the max size of memcpy to inline, default 
is 1024.
  
  mexplicit-relocs

-Target Var(TARGET_EXPLICIT_RELOCS) Init(HAVE_AS_EXPLICIT_RELOCS & 
!HAVE_AS_MRELAX_OPTION)
+Target Var(TARGET_EXPLICIT_RELOCS) Init(M_OPTION_NOT_SEEN)
  Use %reloc() assembly operators.
  
  ; The code model option names for -mcmodel.

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 99657fb44d8..792ce283bb9 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -25830,12 +25830,11 @@ The default code model is @code{normal}.
  @itemx -mno-explicit-relocs
  Use or do not use assembler relocation operators when dealing with symbolic
  addresses.  The alternative is to use assembler macros instead, which may
-limit optimization.  The default value for the option is determined during
-GCC build-time by detecting corresponding assembler support:
-@code{-mexplicit-relocs} if said support is present,
-@code{-mno-explicit-relocs} otherwise.  This option is mostly useful for
-debugging, or interoperation with assemblers different from the build-time
-one.
+limit instruction scheduling but allow linker relaxation.  The default
+value for the option is determined with the assembler capability detected
+during GCC build-time and the setting of @code{-mrelax}:
+@code{-mexplicit-relocs} if the assembler supports relocation operators
+but @code{-mrelax} is not enabled, @code{-mno-explicit-relocs} otherwise.
  
  @opindex mdirect-extern-access

  @item -mdirect-extern-access




[GCC 13 PATCH] LoongArch: Don't default to -mno-explicit-relocs if -mno-relax

2024-02-22 Thread Xi Ruoyao
To improve Binutils compatibility we've had to backport relaxation
support.  But if a user just updates to GCC 13.3 and sticks with
Binutils 2.41, there is no reason to use -mno-explicit-relocs as the
default because we are turning off relaxation for Binutils 2.41 (it
lacks conditional branch relaxation support) anyway.

So like GCC 14, make the default of -m[no-]explicit-relocs depend on
-m[no-]relax instead of HAVE_AS_MRELAX_OPTION.  Also update the doc to
reflect the behavior change.
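
The defaulting rule can be sketched outside GCC as a tri-state resolution: an explicit user choice wins, otherwise the default is derived from assembler support and the relaxation setting. This is only an illustration; `OPTION_NOT_SEEN`, its value of -1, and `resolve_explicit_relocs` are hypothetical stand-ins, not the names or values used in the LoongArch backend.

```c
#include <assert.h>

/* Hypothetical stand-in for the "option not given" sentinel.  */
#define OPTION_NOT_SEEN (-1)

/* Resolve the effective explicit-relocs setting.
   USER_SETTING is OPTION_NOT_SEEN, 0 (-mno-explicit-relocs) or
   1 (-mexplicit-relocs).  */
static int
resolve_explicit_relocs (int user_setting, int as_has_explicit_relocs,
                         int relax_enabled)
{
  assert (user_setting == OPTION_NOT_SEEN
          || user_setting == 0 || user_setting == 1);

  /* An explicit command-line choice is always honoured.  */
  if (user_setting != OPTION_NOT_SEEN)
    return user_setting;

  /* Default: use relocation operators only if the assembler supports
     them and relaxation is off.  */
  return as_has_explicit_relocs && !relax_enabled;
}
```

Under these assumptions, a capable assembler with relaxation disabled resolves to 1, and the same assembler with relaxation enabled resolves to 0.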

gcc/ChangeLog:

* config/loongarch/genopts/loongarch.opt.in
(TARGET_EXPLICIT_RELOCS): Init to M_OPTION_NOT_SEEN.
* config/loongarch/loongarch.opt: Regenerate.
* config/loongarch/loongarch.cc
(loongarch_option_override_internal): Set the default of
TARGET_EXPLICIT_RELOCS to HAVE_AS_EXPLICIT_RELOCS
&& !loongarch_mrelax.
* doc/invoke.texi (-m[no-]explicit-relocs): Update for
LoongArch.
---

Ok for releases/gcc-13?

 gcc/config/loongarch/genopts/loongarch.opt.in |  2 +-
 gcc/config/loongarch/loongarch.cc |  4 
 gcc/config/loongarch/loongarch.opt|  2 +-
 gcc/doc/invoke.texi   | 11 +--
 4 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/gcc/config/loongarch/genopts/loongarch.opt.in 
b/gcc/config/loongarch/genopts/loongarch.opt.in
index da6fedd153e..76acd35d39c 100644
--- a/gcc/config/loongarch/genopts/loongarch.opt.in
+++ b/gcc/config/loongarch/genopts/loongarch.opt.in
@@ -155,7 +155,7 @@ Target Joined RejectNegative UInteger 
Var(loongarch_max_inline_memcpy_size) Init
 -mmax-inline-memcpy-size=SIZE  Set the max size of memcpy to inline, default 
is 1024.
 
 mexplicit-relocs
-Target Var(TARGET_EXPLICIT_RELOCS) Init(HAVE_AS_EXPLICIT_RELOCS & 
!HAVE_AS_MRELAX_OPTION)
+Target Var(TARGET_EXPLICIT_RELOCS) Init(M_OPTION_NOT_SEEN)
 Use %reloc() assembly operators.
 
 ; The code model option names for -mcmodel.
diff --git a/gcc/config/loongarch/loongarch.cc 
b/gcc/config/loongarch/loongarch.cc
index 768e2427285..e78b81cd8fc 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -6222,6 +6222,10 @@ loongarch_option_override_internal (struct gcc_options 
*opts)
gcc_unreachable ();
 }
 
+  if (TARGET_EXPLICIT_RELOCS == M_OPTION_NOT_SEEN)
+TARGET_EXPLICIT_RELOCS = (HAVE_AS_EXPLICIT_RELOCS
+ && !loongarch_mrelax);
+
   /* Validate the guard size.  */
   int guard_size = param_stack_clash_protection_guard_size;
 
diff --git a/gcc/config/loongarch/loongarch.opt 
b/gcc/config/loongarch/loongarch.opt
index 59b1e06d3f2..e61fbaed2c1 100644
--- a/gcc/config/loongarch/loongarch.opt
+++ b/gcc/config/loongarch/loongarch.opt
@@ -162,7 +162,7 @@ Target Joined RejectNegative UInteger 
Var(loongarch_max_inline_memcpy_size) Init
 -mmax-inline-memcpy-size=SIZE  Set the max size of memcpy to inline, default 
is 1024.
 
 mexplicit-relocs
-Target Var(TARGET_EXPLICIT_RELOCS) Init(HAVE_AS_EXPLICIT_RELOCS & 
!HAVE_AS_MRELAX_OPTION)
+Target Var(TARGET_EXPLICIT_RELOCS) Init(M_OPTION_NOT_SEEN)
 Use %reloc() assembly operators.
 
 ; The code model option names for -mcmodel.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 99657fb44d8..792ce283bb9 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -25830,12 +25830,11 @@ The default code model is @code{normal}.
 @itemx -mno-explicit-relocs
 Use or do not use assembler relocation operators when dealing with symbolic
 addresses.  The alternative is to use assembler macros instead, which may
-limit optimization.  The default value for the option is determined during
-GCC build-time by detecting corresponding assembler support:
-@code{-mexplicit-relocs} if said support is present,
-@code{-mno-explicit-relocs} otherwise.  This option is mostly useful for
-debugging, or interoperation with assemblers different from the build-time
-one.
+limit instruction scheduling but allow linker relaxation.  The default
+value for the option is determined with the assembler capability detected
+during GCC build-time and the setting of @code{-mrelax}:
+@code{-mexplicit-relocs} if the assembler supports relocation operators
+but @code{-mrelax} is not enabled, @code{-mno-explicit-relocs} otherwise.
 
 @opindex mdirect-extern-access
 @item -mdirect-extern-access
-- 
2.43.2



Re: [PATCH] profile-count: Don't dump through a temporary buffer [PR111960]

2024-02-22 Thread Richard Biener
On Thu, Feb 22, 2024 at 10:07 AM Jakub Jelinek  wrote:
>
> Hi!
>
> The profile_count::dump (char *, struct function * = NULL) const;
> method has a single caller, the
> profile_count::dump (FILE *f, struct function *fun) const;
> method and for that going through a temporary buffer is just slower
> and opens doors for buffer overflows, which is exactly why this P1
> was filed.
> The buffer size is 64 bytes, the previous maximum
> "%" PRId64 " (%s)"
> would print up to 61 bytes in there (19 bytes for arbitrary uint64_t:61
> bitfield printed as signed, "estimated locally, globally 0 adjusted"
> i.e. 38 bytes longest %s and 4 other characters).
> Now, after the r14-2389 changes, it can be
> 19 + 38 plus 11 other characters + %.4f, which is worst case
> 309 chars before decimal point, decimal point and 4 digits after it,
> so total 382 bytes.
>
> So, either we could bump the buffer[64] to buffer[400], or the following
> patch just drops the indirection through buffer and prints it directly to
> stream.  After all, having APIs which fill in some buffer without passing
> down the size of the buffer is just asking for buffer overflows over time.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Richard.

> Or do you want buffer[400]; instead?  Or buffer[128]; and somehow prevent
> arbitrarily long doubles?  Or add size_t next to char * arguments and use
> snprintf?  Though, truncated messages would look ugly.
>
> 2024-02-22  Jakub Jelinek  
>
> PR ipa/111960
> * profile-count.h (profile_count::dump): Remove overload with
> char * first argument.
> * profile-count.cc (profile_count::dump): Change overload with char *
> first argument which uses sprintf into the overload with FILE *
> first argument and use fprintf instead.  Remove overload which wrapped
> it.
>
> --- gcc/profile-count.h.jj  2024-01-03 11:51:30.309748150 +0100
> +++ gcc/profile-count.h 2024-02-21 21:04:22.338905728 +0100
> @@ -1299,9 +1299,6 @@ public:
>/* Output THIS to F.  */
>void dump (FILE *f, struct function *fun = NULL) const;
>
> -  /* Output THIS to BUFFER.  */
> -  void dump (char *buffer, struct function *fun = NULL) const;
> -
>/* Print THIS to stderr.  */
>void debug () const;
>
> --- gcc/profile-count.cc.jj 2024-01-03 11:51:40.782602796 +0100
> +++ gcc/profile-count.cc2024-02-21 21:05:28.521994913 +0100
> @@ -84,34 +84,24 @@ const char *profile_quality_display_name
>"precise"
>  };
>
> -/* Dump THIS to BUFFER.  */
> +/* Dump THIS to F.  */
>
>  void
> -profile_count::dump (char *buffer, struct function *fun) const
> +profile_count::dump (FILE *f, struct function *fun) const
>  {
>if (!initialized_p ())
> -sprintf (buffer, "uninitialized");
> +fprintf (f, "uninitialized");
>else if (fun && initialized_p ()
>&& fun->cfg
>&& ENTRY_BLOCK_PTR_FOR_FN (fun)->count.initialized_p ())
> -sprintf (buffer, "%" PRId64 " (%s, freq %.4f)", m_val,
> +fprintf (f, "%" PRId64 " (%s, freq %.4f)", m_val,
>  profile_quality_display_names[m_quality],
>  to_sreal_scale (ENTRY_BLOCK_PTR_FOR_FN (fun)->count).to_double 
> ());
>else
> -sprintf (buffer, "%" PRId64 " (%s)", m_val,
> +fprintf (f, "%" PRId64 " (%s)", m_val,
>  profile_quality_display_names[m_quality]);
>  }
>
> -/* Dump THIS to F.  */
> -
> -void
> -profile_count::dump (FILE *f, struct function *fun) const
> -{
> -  char buffer[64];
> -  dump (buffer, fun);
> -  fputs (buffer, f);
> -}
> -
>  /* Dump THIS to stderr.  */
>
>  void
>
> Jakub
>


[PATCH] LoongArch: Don't falsely claim gold supported in toplevel configure

2024-02-22 Thread Xi Ruoyao
The gold linker has never been ported to LoongArch (and it seems
unlikely to be ported in the future as the new architectures are
focusing on lld and/or mold for fast linkers).

ChangeLog:

* configure.ac (ENABLE_GOLD): Remove loongarch*-*-* from target
list.
* configure: Regenerate.
---

Ok for GCC trunk (to get synced into Binutils later)?

 configure| 2 +-
 configure.ac | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/configure b/configure
index 874966fb9f0..02b435c1163 100755
--- a/configure
+++ b/configure
@@ -3092,7 +3092,7 @@ case "${ENABLE_GOLD}" in
   # Check for target supported by gold.
   case "${target}" in
 i?86-*-* | x86_64-*-* | sparc*-*-* | powerpc*-*-* | arm*-*-* \
-| aarch64*-*-* | tilegx*-*-* | mips*-*-* | s390*-*-* | loongarch*-*-*)
+| aarch64*-*-* | tilegx*-*-* | mips*-*-* | s390*-*-*)
  configdirs="$configdirs gold"
  if test x${ENABLE_GOLD} = xdefault; then
default_ld=gold
diff --git a/configure.ac b/configure.ac
index 4f34004a072..1a19c07a27b 100644
--- a/configure.ac
+++ b/configure.ac
@@ -364,7 +364,7 @@ case "${ENABLE_GOLD}" in
   # Check for target supported by gold.
   case "${target}" in
 i?86-*-* | x86_64-*-* | sparc*-*-* | powerpc*-*-* | arm*-*-* \
-| aarch64*-*-* | tilegx*-*-* | mips*-*-* | s390*-*-* | loongarch*-*-*)
+| aarch64*-*-* | tilegx*-*-* | mips*-*-* | s390*-*-*)
  configdirs="$configdirs gold"
  if test x${ENABLE_GOLD} = xdefault; then
default_ld=gold
-- 
2.43.2



Re: [PATCH] call-cdce: Add missing BUILT_IN_*F{32,64}X handling and improve BUILT_IN_*L [PR113993]

2024-02-22 Thread Richard Biener
On Wed, 21 Feb 2024, Jakub Jelinek wrote:

> Hi!
> 
> The following testcase ICEs, because can_test_argument_range
> returns true for BUILT_IN_{COSH,SINH,EXP{,M1,2}}{F32X,F64X}
> among many other builtins, but get_no_error_domain doesn't handle
> those.
> 
> float32x_type_node when supported in GCC always has DFmode, so that
> case is easy (and call-cdce assumes that SFmode is IEEE float and DFmode
> is IEEE double).  So *F32X is simply handled by adding those cases
> next to *F64.
> float64x_type_node when supported in GCC by definition has a mode
> with larger precision and exponent range than DFmode, so it can be XFmode,
> TFmode or KFmode.  I went through all the l/f128 suffixed builtins and
> verified that the float128x_type_node no error domain range is actually
> identical to the Intel extended long double no error domain range; it isn't
> that surprising, both IEEE quad and Intel/Motorola extended have the same
> exponent range [-16381, 16384] (well, Motorola -16382 probably because of
> different behavior for denormals, but that has nothing to do with
> get_no_error_domain which is about large inputs overflowing into +-Inf
> or triggering NaN, denormals could in theory do something solely for sqrt
> and even that is fine).  In theory some target could have different larger
> type, so for *F64X the code verifies that
> REAL_MODE_FORMAT (TYPE_MODE (float64x_type_node))->emax == 16384
> and if so, uses the *F128 domains, otherwise falls back to the non-suffixed
> ones (aka *F64), that is certainly the conservative minimum.
> While at it, the patch also changes the *L suffixed cases to do pretty much
> the same, the comment said that the function just assumes for *L
> the *F64 ranges, but that is unnecessarily conservative.
> All we currently have for long double is:
> 1) IEEE quad (emax 16384, *F128 ranges)
> 2) XFmode Intel/Motorola extended (emax 16384, same as *F128 ranges)
> 3) IBM extended (double double, emax 1024, the extra precision doesn't
>really help and the domains are the same as for *F64)
> 4) same as double (*F64 again)
> So, the patch uses also for *L
> REAL_MODE_FORMAT (TYPE_MODE (long_double_type_node))->emax == 16384
> checks and either tail recurses into the *F128 case for that or to
> non-suffixed (aka *F64) case otherwise.
> BUILT_IN_*F128X not handled because no target has those and it doesn't
> seem something is on the horizon and who knows what would be used for that.
> Thus, all we get this wrong for are probably VAX floats or something
> similar, no intent from me to look at that, that is preexisting issue.
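
As a rough cross-check of how emax relates to these domains, the upper bound for exp-like functions can be estimated from the exponent range alone. The sketch below is standalone C, not code from tree-call-cdce.cc; the function names are invented, and the bound is simply the integer part of emax * ln 2 (or emax * log10 2 for the base-10 variants).

```c
#include <assert.h>

/* ln(2) and log10(2), used to turn a binary exponent limit into a bound
   on the argument of exp() and exp10().  */
#define LN_2     0.6931471805599453
#define LOG10_2  0.30102999566398120

/* Largest integer x for which e^x stays below 2^EMAX.  */
static int
exp_upper_bound (int emax)
{
  assert (emax > 0);
  return (int) (emax * LN_2);
}

/* Largest integer x for which 10^x stays below 2^EMAX.  */
static int
exp10_upper_bound (int emax)
{
  assert (emax > 0);
  return (int) (emax * LOG10_2);
}
```

For emax == 16384 the base-10 estimate comes out at 4932, which agrees with the (-inf, 4932) domain named in the ChangeLog for BUILT_IN_{EXP,POW}10L; for emax == 1024 the same arithmetic gives 709 and 308.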
> 
> BTW, I'm surprised we don't have BUILT_IN_EXP10F{16,32,64,128,32X,64X,128X}
> builtins, seems glibc has those (sure, I think except *16 and *128x).
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Thanks,
Richard.

> 2024-02-21  Jakub Jelinek  
> 
>   PR tree-optimization/113993
>   * tree-call-cdce.cc (get_no_error_domain): Handle
>   BUILT_IN_{COSH,SINH,EXP{,M1,2}}{F32X,F64X}.  Handle
>   BUILT_IN_{COSH,SINH,EXP{,M1,2}}L for
>   REAL_MODE_FORMAT (TYPE_MODE (long_double_type_node))->emax == 16384
>   the same as the F128 suffixed cases, otherwise as non-suffixed ones.
>   Handle BUILT_IN_{EXP,POW}10L for
>   REAL_MODE_FORMAT (TYPE_MODE (long_double_type_node))->emax == 16384
>   as (-inf, 4932).
> 
>   * gcc.dg/tree-ssa/pr113993.c: New test.
> 
> --- gcc/tree-call-cdce.cc.jj  2024-01-03 11:51:37.654646209 +0100
> +++ gcc/tree-call-cdce.cc 2024-02-20 09:19:24.432837856 +0100
> @@ -677,14 +677,14 @@ gen_conditions_for_pow (gcall *pow_call,
> Since IEEE only sets minimum requirements for long double format,
> different long double formats exist under different implementations
> (e.g, 64 bit double precision (DF), 80 bit double-extended
> -   precision (XF), and 128 bit quad precision (QF) ).  For simplicity,
> +   precision (XF), and 128 bit quad precision (TF) ).  For simplicity,
> in this implementation, the computed bounds for long double assume
> -   64 bit format (DF), and are therefore conservative.  Another
> -   assumption is that single precision float type is always SF mode,
> -   and double type is DF mode.  This function is quite
> -   implementation specific, so it may not be suitable to be part of
> -   builtins.cc.  This needs to be revisited later to see if it can
> -   be leveraged in x87 assembly expansion.  */
> +   64 bit format (DF) except when it is IEEE quad or extended with the same
> +   emax, and are therefore sometimes conservative.  Another assumption is
> +   that single precision float type is always SF mode, and double type is DF
> +   mode.  This function is quite implementation specific, so it may not be
> +   suitable to be part of builtins.cc.  This needs to be revisited later
> +   to see if it can be leveraged in x87 assembly expansion.  */
>  
>  static inp_domain
>  get_no_error_domain (enum built_in_function fnc)
> @@ -723,10 +723,10 @@ get

[PATCH] profile-count: Don't dump through a temporary buffer [PR111960]

2024-02-22 Thread Jakub Jelinek
Hi!

The profile_count::dump (char *, struct function * = NULL) const;
method has a single caller, the
profile_count::dump (FILE *f, struct function *fun) const;
method and for that going through a temporary buffer is just slower
and opens doors for buffer overflows, which is exactly why this P1
was filed.
The buffer size is 64 bytes, the previous maximum
"%" PRId64 " (%s)"
would print up to 61 bytes in there (19 bytes for arbitrary uint64_t:61
bitfield printed as signed, "estimated locally, globally 0 adjusted"
i.e. 38 bytes longest %s and 4 other characters).
Now, after the r14-2389 changes, it can be
19 + 38 plus 11 other characters + %.4f, which is worst case
309 chars before decimal point, decimal point and 4 digits after it,
so total 382 bytes.
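
The arithmetic above can be checked mechanically.  What follows is only a
minimal sketch, not part of the patch: the helper name worst_case_dump_size
is hypothetical, and it assumes an IEEE double DBL_MAX (309 digits before the
decimal point) and a C99 snprintf that reports the required length when given
a null buffer.

```c
#include <assert.h>
#include <float.h>
#include <inttypes.h>
#include <stdio.h>

/* Hypothetical helper (not in GCC): bytes needed, including the
   terminating NUL, for the longest string the post-r14-2389 format
   can produce.  */
int
worst_case_dump_size (void)
{
  /* Largest value of a 61-bit bitfield: 19 decimal digits.  */
  const int64_t max_val = (INT64_C (1) << 61) - 1;
  /* Longest quality name quoted in the mail: 38 characters.  */
  const char *longest_name = "estimated locally, globally 0 adjusted";
  /* With a null buffer and size 0, snprintf only computes the length.  */
  int len = snprintf (NULL, 0, "%" PRId64 " (%s, freq %.4f)",
                      max_val, longest_name, DBL_MAX);
  assert (len > 0);
  /* Add one byte for the terminating NUL.  */
  return len + 1;
}
```

Under those assumptions the result should match the 382 bytes computed by
hand, far beyond the old buffer[64].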

So, either we could bump the buffer[64] to buffer[400], or the following
patch just drops the indirection through buffer and prints it directly to
stream.  After all, having APIs which fill in some buffer without passing
down the size of the buffer is just asking for buffer overflows over time.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Or do you want buffer[400]; instead?  Or buffer[128]; and somehow prevent
arbitrarily long doubles?  Or add size_t next to char * arguments and use
snprintf?  Though, truncated messages would look ugly.

2024-02-22  Jakub Jelinek  

PR ipa/111960
* profile-count.h (profile_count::dump): Remove overload with
char * first argument.
* profile-count.cc (profile_count::dump): Change overload with char *
first argument which uses sprintf into the overload with FILE *
first argument and use fprintf instead.  Remove overload which wrapped
it.

--- gcc/profile-count.h.jj  2024-01-03 11:51:30.309748150 +0100
+++ gcc/profile-count.h 2024-02-21 21:04:22.338905728 +0100
@@ -1299,9 +1299,6 @@ public:
   /* Output THIS to F.  */
   void dump (FILE *f, struct function *fun = NULL) const;
 
-  /* Output THIS to BUFFER.  */
-  void dump (char *buffer, struct function *fun = NULL) const;
-
   /* Print THIS to stderr.  */
   void debug () const;
 
--- gcc/profile-count.cc.jj 2024-01-03 11:51:40.782602796 +0100
+++ gcc/profile-count.cc 2024-02-21 21:05:28.521994913 +0100
@@ -84,34 +84,24 @@ const char *profile_quality_display_name
   "precise"
 };
 
-/* Dump THIS to BUFFER.  */
+/* Dump THIS to F.  */
 
 void
-profile_count::dump (char *buffer, struct function *fun) const
+profile_count::dump (FILE *f, struct function *fun) const
 {
   if (!initialized_p ())
-sprintf (buffer, "uninitialized");
+fprintf (f, "uninitialized");
   else if (fun && initialized_p ()
   && fun->cfg
   && ENTRY_BLOCK_PTR_FOR_FN (fun)->count.initialized_p ())
-sprintf (buffer, "%" PRId64 " (%s, freq %.4f)", m_val,
+fprintf (f, "%" PRId64 " (%s, freq %.4f)", m_val,
 profile_quality_display_names[m_quality],
 to_sreal_scale (ENTRY_BLOCK_PTR_FOR_FN (fun)->count).to_double ());
   else
-sprintf (buffer, "%" PRId64 " (%s)", m_val,
+fprintf (f, "%" PRId64 " (%s)", m_val,
 profile_quality_display_names[m_quality]);
 }
 
-/* Dump THIS to F.  */
-
-void
-profile_count::dump (FILE *f, struct function *fun) const
-{
-  char buffer[64];
-  dump (buffer, fun);
-  fputs (buffer, f);
-}
-
 /* Dump THIS to stderr.  */
 
 void

Jakub



Re: [PATCH] libcpp: Improve location for macro names [PR66290]

2024-02-22 Thread Richard Biener
On Tue, Feb 20, 2024 at 3:33 PM Lewis Hyatt  wrote:
>
> On Mon, Feb 19, 2024 at 11:36 PM Alexandre Oliva  wrote:
> >
> > This backport for gcc-13 is the first of two required for the
> > g++.dg/pch/line-map-3.C test to stop hitting a variant of the known
> > problem mentioned in that testcase: on riscv64-elf and riscv32-elf,
> > after restoring the PCH, the location of the macros is mentioned as if
> > they were on line 3 rather than 2, so even the existing xfails fail.  I
> > think this might be too much to backport, and I'm ready to use an xfail
> > instead, but since this would bring more predictability, I thought I'd
> > ask whether you'd find this backport acceptable.
> >
> > Regstrapped on x86_64-linux-gnu, along with other backports, and tested
> > manually on riscv64-elf.  Ok to install?
>
> Sorry that test is causing a problem, I hadn't realized at first that
> the wrong output was target-dependent. I feel like simply deleting
> this test g++.dg/pch/line-map-3.C from GCC 13 branch should be fine
> too, as a safer alternative, if release managers prefer?

Yes please.

Richard.

> It doesn't really need to be on the branch; its only purpose is to
> remind me to fix the underlying issue for GCC 15...
>
> -Lewis


[PING] Re: [PATCH 1/2] c-family: -Waddress-of-packed-member and casts

2024-02-22 Thread Torbjorn SVENSSON

Ping!

Kind regards,
Torbjörn


On 2024-02-07 17:19, Torbjorn SVENSSON wrote:

Hi,

Is it okay to backport b7e4a4c626eeeb32c291d5bbbaa148c5081b6bfd to 
releases/gcc-13?


Without this backport, I see these failures on arm-none-eabi:

FAIL: gcc.dg/analyzer/null-deref-pr108251-smp_fetch_ssl_fc_has_early-O2.c (test for excess errors)
FAIL: gcc.dg/analyzer/null-deref-pr108251-smp_fetch_ssl_fc_has_early.c (test for excess errors)


Kind regards,
Torbjörn

On 2023-12-11 08:28, Richard Biener wrote:

On Wed, Nov 22, 2023 at 11:45 PM Jason Merrill  wrote:


Tested x86_64-pc-linux-gnu, OK for trunk?


OK


-- 8< --

-Waddress-of-packed-member, in addition to the documented warning about
taking the address of a packed member, also warns about casting from
a pointer to a TYPE_PACKED type to a pointer to a type with greater
alignment.

This wrongly warns if the source is a pointer to enum when -fshort-enums
is on, since that is also represented by TYPE_PACKED.

And there's already -Wcast-align to catch casting from pointer to less
aligned type (packed or otherwise) to pointer to more aligned type; even
apart from the enum problem, this seems like a somewhat arbitrary subset
of that warning.  Though that isn't currently on by default.

So, this patch removes the undocumented type-based warning from
-Waddress-of-packed-member.  Some of the tests where the warning is
desirable I changed to use -Wcast-align=strict instead.  The ones that
require -Wno-incompatible-pointer-types, I just removed.
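
To illustrate the cast case discussed above, here is a small sketch.  The
type and function names are made up for illustration, and
__attribute__ ((packed)) is the GNU extension, so this assumes GCC or a
compatible compiler.  Per the description, the cast is what the type-based
warning used to flag and what -Wcast-align=strict is meant to catch instead.

```c
#include <assert.h>

/* Hypothetical packed type: packing drops its alignment to 1.  */
struct __attribute__ ((packed)) packed_pair
{
  char tag;
  int value;
};

/* Cast from a pointer to a packed type to a pointer to a more aligned
   type.  The result is only returned here, never dereferenced.  */
int *
as_int_pointer (struct packed_pair *p)
{
  return (int *) p;
}

/* Nonzero when the cast above goes from a less aligned type to a more
   aligned one, which is the condition the cast warning is about.  */
int
cast_raises_alignment (void)
{
  return _Alignof (int) > _Alignof (struct packed_pair);
}
```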

gcc/c-family/ChangeLog:

 * c-warn.cc (check_address_or_pointer_of_packed_member):
 Remove warning based on TYPE_PACKED.

gcc/testsuite/ChangeLog:

 * c-c++-common/Waddress-of-packed-member-1.c: Don't expect
 a warning on the cast cases.
 * c-c++-common/pr51628-35.c: Use -Wcast-align=strict.
 * g++.dg/warn/Waddress-of-packed-member3.C: Likewise.
 * gcc.dg/pr88928.c: Likewise.
 * gcc.dg/pr51628-20.c: Removed.
 * gcc.dg/pr51628-21.c: Removed.
 * gcc.dg/pr51628-25.c: Removed.
---
  gcc/c-family/c-warn.cc    | 58 +--
  .../Waddress-of-packed-member-1.c | 12 ++--
  gcc/testsuite/c-c++-common/pr51628-35.c   |  6 +-
  .../g++.dg/warn/Waddress-of-packed-member3.C  |  8 +--
  gcc/testsuite/gcc.dg/pr51628-20.c | 11 
  gcc/testsuite/gcc.dg/pr51628-21.c | 11 
  gcc/testsuite/gcc.dg/pr51628-25.c |  9 ---
  gcc/testsuite/gcc.dg/pr88928.c    |  6 +-
  8 files changed, 19 insertions(+), 102 deletions(-)
  delete mode 100644 gcc/testsuite/gcc.dg/pr51628-20.c
  delete mode 100644 gcc/testsuite/gcc.dg/pr51628-21.c
  delete mode 100644 gcc/testsuite/gcc.dg/pr51628-25.c

diff --git a/gcc/c-family/c-warn.cc b/gcc/c-family/c-warn.cc
index d2938b91043..2a399ba6d14 100644
--- a/gcc/c-family/c-warn.cc
+++ b/gcc/c-family/c-warn.cc
@@ -2991,10 +2991,9 @@ check_alignment_of_packed_member (tree type, tree field, bool rvalue)

    return NULL_TREE;
  }

-/* Return struct or union type if the right hand value, RHS:
-   1. Is a pointer value which isn't aligned to a pointer type TYPE.
-   2. Is an address which takes the unaligned address of packed member
-  of struct or union when assigning to TYPE.
+/* Return struct or union type if the right hand value, RHS
+   is an address which takes the unaligned address of packed member
+   of struct or union when assigning to TYPE.
 Otherwise, return NULL_TREE.  */

  static tree
@@ -3021,57 +3020,6 @@ check_address_or_pointer_of_packed_member (tree type, tree rhs)


    type = TREE_TYPE (type);

-  if (TREE_CODE (rhs) == PARM_DECL
-  || VAR_P (rhs)
-  || TREE_CODE (rhs) == CALL_EXPR)
-    {
-  tree rhstype = TREE_TYPE (rhs);
-  if (TREE_CODE (rhs) == CALL_EXPR)
-   {
- rhs = CALL_EXPR_FN (rhs); /* Pointer expression.  */
- if (rhs == NULL_TREE)
-   return NULL_TREE;
- rhs = TREE_TYPE (rhs);    /* Pointer type.  */
- /* We could be called while processing a template and RHS could be
-    a functor.  In that case it's a class, not a pointer.  */
- if (!rhs || !POINTER_TYPE_P (rhs))
-   return NULL_TREE;
- rhs = TREE_TYPE (rhs);    /* Function type.  */
- rhstype = TREE_TYPE (rhs);
- if (!rhstype || !POINTER_TYPE_P (rhstype))
-   return NULL_TREE;
- rvalue = true;
-   }
-  if (rvalue && POINTER_TYPE_P (rhstype))
-   rhstype = TREE_TYPE (rhstype);
-  while (TREE_CODE (rhstype) == ARRAY_TYPE)
-   rhstype = TREE_TYPE (rhstype);
-  if (TYPE_PACKED (rhstype))
-   {
- unsigned int type_align = min_align_of_type (type);
- unsigned int rhs_align = min_align_of_type (rhstype);
- if (rhs_align < type_align)
-   {
- auto_diagnostic_group d;
- location_t location = EXPR_LOC_OR_LOC (rhs, input_location);

-   

Re: [PATCH] bitintlower: Fix .MUL_OVERFLOW overflow checking [PR114038]

2024-02-22 Thread Richard Biener
On Thu, 22 Feb 2024, Jakub Jelinek wrote:

> Hi!
> 
> Currently, bitint_large_huge::lower_mul_overflow uses cnt 1 only if
> startlimb == endlimb and in that case doesn't use a loop and handles
> everything in a special if:
>   unsigned cnt;
>   bool use_loop = false;
>   if (startlimb == endlimb)
> cnt = 1;
>   else if (startlimb + 1 == endlimb)
> cnt = 2;
>   else if ((end % limb_prec) == 0)
> {
>   cnt = 2;
>   use_loop = true;
> }
>   else
> {
>   cnt = 3;
>   use_loop = startlimb + 2 < endlimb;
> }
>   if (cnt == 1)
>   {
> ...
>   }
>   else
> The loop handling for the loop exit condition wants to compare if the
> incremented index is equal to endlimb, but that is correct only if
> end is not divisible by limb_prec and there will be a straight line
> check after the loop as well for the most significant limb.  The code
> used endlimb + (cnt == 1) for that, but cnt == 1 is never true here,
> because cnt is either 2 or 3, so the right check is (cnt == 2).
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Thanks,
Richard.

> 2024-02-22  Jakub Jelinek  
> 
>   PR tree-optimization/114038
>   * gimple-lower-bitint.cc (bitint_large_huge::lower_mul_overflow): Fix
>   loop exit condition if end is divisible by limb_prec.
> 
>   * gcc.dg/torture/bitint-59.c: New test.
> 
> --- gcc/gimple-lower-bitint.cc.jj 2024-02-15 09:52:40.999145971 +0100
> +++ gcc/gimple-lower-bitint.cc 2024-02-21 20:04:27.590388930 +0100
> @@ -4497,7 +4497,7 @@ bitint_large_huge::lower_mul_overflow (t
>  size_one_node);
> insert_before (g);
> g = gimple_build_cond (NE_EXPR, idx_next,
> -  size_int (endlimb + (cnt == 1)),
> +  size_int (endlimb + (cnt == 2)),
>NULL_TREE, NULL_TREE);
> insert_before (g);
> edge true_edge, false_edge;
> --- gcc/testsuite/gcc.dg/torture/bitint-59.c.jj   2024-02-21 20:07:11.028142323 +0100
> +++ gcc/testsuite/gcc.dg/torture/bitint-59.c  2024-02-21 20:07:57.854498649 +0100
> @@ -0,0 +1,22 @@
> +/* PR tree-optimization/114038 */
> +/* { dg-do run { target bitint } } */
> +/* { dg-options "-std=c23 -pedantic-errors" } */
> +/* { dg-skip-if "" { ! run_expensive_tests }  { "*" } { "-O0" "-O2" } } */
> +/* { dg-skip-if "" { ! run_expensive_tests } { "-flto" } { "" } } */
> +
> +#if __BITINT_MAXWIDTH__ >= 129
> +int
> +foo (unsigned _BitInt(63) x, unsigned _BitInt(129) y)
> +{
> +  return __builtin_mul_overflow_p (y, x, 0);
> +}
> +#endif
> +
> +int
> +main ()
> +{
> +#if __BITINT_MAXWIDTH__ >= 129
> +  if (!foo (90, 0x8000uwb))
> +__builtin_abort ();
> +#endif
> +}
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)


[PING] Re: [PATCH] analyzer: deal with -fshort-enums

2024-02-22 Thread Torbjorn SVENSSON

Ping!

Kind regards,
Torbjörn

On 2024-02-07 17:21, Torbjorn SVENSSON wrote:

Hi,

Is it okay to backport 3cbab07b08d2f3a3ed34b6ec12e67727c59d285c to 
releases/gcc-13?


Without this backport, I see these failures on arm-none-eabi:

FAIL: gcc.dg/analyzer/switch-enum-1.c  (test for bogus messages, line 26)
FAIL: gcc.dg/analyzer/switch-enum-1.c  (test for bogus messages, line 44)
FAIL: gcc.dg/analyzer/switch-enum-2.c  (test for bogus messages, line 34)
FAIL: gcc.dg/analyzer/switch-enum-2.c  (test for bogus messages, line 52)
FAIL: gcc.dg/analyzer/torture/switch-enum-pr105273-doom-p_floor.c   -O0 
  (test for bogus messages, line 82)
FAIL: gcc.dg/analyzer/torture/switch-enum-pr105273-doom-p_maputl.c   -O0 
   (test for bogus messages, line 83)


Kind regards,
Torbjörn


On 2023-12-06 23:22, David Malcolm wrote:

On Wed, 2023-12-06 at 02:31 -0300, Alexandre Oliva wrote:

On Nov 22, 2023, Alexandre Oliva  wrote:


Ah, nice, that's a great idea, I wish I'd thought of that!  Will
do.


Sorry it took me so long, here it is.  I added two tests, so that,
regardless of the defaults, we get both circumstances tested, without
repetition.

Regstrapped on x86_64-linux-gnu.  Also tested on arm-eabi.  Ok to
install?


Thanks for the updated patch.

Looks good to me.

Dave




analyzer: deal with -fshort-enums

On platforms that enable -fshort-enums by default, various switch-enum
analyzer tests fail, because apply_constraints_for_gswitch doesn't
expect the integral promotion type cast.  I've arranged for the code
to cope with those casts.


for  gcc/analyzer/ChangeLog

 * region-model.cc (has_nondefault_case_for_value_p): Take
 enumerate type as a parameter.
 (region_model::apply_constraints_for_gswitch): Cope with
 integral promotion type casts.

for  gcc/testsuite/ChangeLog

 * gcc.dg/analyzer/switch-short-enum-1.c: New.
 * gcc.dg/analyzer/switch-no-short-enum-1.c: New.
---
  gcc/analyzer/region-model.cc   |   27 +++-
  .../gcc.dg/analyzer/switch-no-short-enum-1.c   |  141
  .../gcc.dg/analyzer/switch-short-enum-1.c  |  140
  3 files changed, 304 insertions(+), 4 deletions(-)
  create mode 100644 gcc/testsuite/gcc.dg/analyzer/switch-no-short-enum-1.c
  create mode 100644 gcc/testsuite/gcc.dg/analyzer/switch-short-enum-1.c

diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc
index 2157ad2578b85..6a7a8bc9f4884 100644
--- a/gcc/analyzer/region-model.cc
+++ b/gcc/analyzer/region-model.cc
@@ -5387,10 +5387,10 @@ has_nondefault_case_for_value_p (const gswitch *switch_stmt, tree int_cst)
 has nondefault cases handling all values in the enum.  */
  static bool
-has_nondefault_cases_for_all_enum_values_p (const gswitch *switch_stmt)
+has_nondefault_cases_for_all_enum_values_p (const gswitch *switch_stmt,
+   tree type)
  {
    gcc_assert (switch_stmt);
-  tree type = TREE_TYPE (gimple_switch_index (switch_stmt));
    gcc_assert (TREE_CODE (type) == ENUMERAL_TYPE);
    for (tree enum_val_iter = TYPE_VALUES (type);
@@ -5426,6 +5426,23 @@ apply_constraints_for_gswitch (const switch_cfg_superedge &edge,
  {
    tree index  = gimple_switch_index (switch_stmt);
    const svalue *index_sval = get_rvalue (index, ctxt);
+  bool check_index_type = true;
+
+  /* With -fshort-enum, there may be a type cast.  */
+  if (ctxt && index_sval->get_kind () == SK_UNARYOP
+  && TREE_CODE (index_sval->get_type ()) == INTEGER_TYPE)
+    {
+  const unaryop_svalue *unaryop = as_a <const unaryop_svalue *> (index_sval);
+  if (unaryop->get_op () == NOP_EXPR
+ && is_a <const initial_svalue *> (unaryop->get_arg ()))
+   if (const initial_svalue *initvalop = (as_a <const initial_svalue *>
+  (unaryop->get_arg ())))
+ if (TREE_CODE (initvalop->get_type ()) == ENUMERAL_TYPE)
+   {
+ index_sval = initvalop;
+ check_index_type = false;
+   }
+    }
    /* If we're switching based on an enum type, assume that the user is only
   working with values from the enum.  Hence if this is an
@@ -5437,12 +5454,14 @@ apply_constraints_for_gswitch (const switch_cfg_superedge &edge,
    ctxt
    /* Must be an enum value.  */
    && index_sval->get_type ()
-  && TREE_CODE (TREE_TYPE (index)) == ENUMERAL_TYPE
+  && (!check_index_type
+ || TREE_CODE (TREE_TYPE (index)) == ENUMERAL_TYPE)
    && TREE_CODE (index_sval->get_type ()) == ENUMERAL_TYPE
    /* If we have a constant, then we can check it directly.  */
    && index_sval->get_kind () != SK_CONSTANT
    && edge.implicitly_created_default_p ()
-  && has_nondefault_cases_for_all_enum_values_p (switch_stmt)
+  && has_nondefault_cases_for_all_enum_values_p (switch_stmt,
+    index_sval->get_type ())

    /* Don't do this if there's a chance that the index is
  attacker-controlled. 

[PATCH] bitintlower: Fix .MUL_OVERFLOW overflow checking [PR114038]

2024-02-22 Thread Jakub Jelinek
Hi!

Currently, bitint_large_huge::lower_mul_overflow uses cnt 1 only if
startlimb == endlimb and in that case doesn't use a loop and handles
everything in a special if:
  unsigned cnt;
  bool use_loop = false;
  if (startlimb == endlimb)
cnt = 1;
  else if (startlimb + 1 == endlimb)
cnt = 2;
  else if ((end % limb_prec) == 0)
{
  cnt = 2;
  use_loop = true;
}
  else
{
  cnt = 3;
  use_loop = startlimb + 2 < endlimb;
}
  if (cnt == 1)
{
  ...
}
  else
The loop handling for the loop exit condition wants to compare if the
incremented index is equal to endlimb, but that is correct only if
end is not divisible by limb_prec and there will be a straight line
check after the loop as well for the most significant limb.  The code
used endlimb + (cnt == 1) for that, but cnt == 1 is never true here,
because cnt is either 2 or 3, so the right check is (cnt == 2).
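
The case analysis above can be modelled outside the compiler.  This is only
an illustrative sketch: struct mul_overflow_plan, plan_mul_overflow and
loop_exit_bound are hypothetical names, the limb indices are plain integers
rather than GCC trees, and only the selection logic quoted above plus the
corrected endlimb + (cnt == 2) bound are reproduced.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two locals in the quoted snippet.  */
struct mul_overflow_plan
{
  unsigned cnt;
  bool use_loop;
};

/* Mirrors the cnt/use_loop selection quoted in the mail.  */
struct mul_overflow_plan
plan_mul_overflow (unsigned startlimb, unsigned endlimb,
                   unsigned end, unsigned limb_prec)
{
  struct mul_overflow_plan plan = { 0, false };
  if (startlimb == endlimb)
    plan.cnt = 1;
  else if (startlimb + 1 == endlimb)
    plan.cnt = 2;
  else if ((end % limb_prec) == 0)
    {
      plan.cnt = 2;
      plan.use_loop = true;
    }
  else
    {
      plan.cnt = 3;
      plan.use_loop = startlimb + 2 < endlimb;
    }
  return plan;
}

/* Value the incremented index is compared against.  With cnt == 3 the
   most significant limb gets a straight-line check after the loop, so
   the loop stops at endlimb; with cnt == 2 the loop must also cover
   endlimb itself, hence the + 1.  */
unsigned
loop_exit_bound (unsigned endlimb, struct mul_overflow_plan plan)
{
  assert (plan.use_loop);
  return endlimb + (plan.cnt == 2);
}
```

The old expression, endlimb + (cnt == 1), always added 0 on this path,
because the cnt == 1 case is handled before the loop code is reached.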

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-02-22  Jakub Jelinek  

PR tree-optimization/114038
* gimple-lower-bitint.cc (bitint_large_huge::lower_mul_overflow): Fix
loop exit condition if end is divisible by limb_prec.

* gcc.dg/torture/bitint-59.c: New test.

--- gcc/gimple-lower-bitint.cc.jj   2024-02-15 09:52:40.999145971 +0100
+++ gcc/gimple-lower-bitint.cc  2024-02-21 20:04:27.590388930 +0100
@@ -4497,7 +4497,7 @@ bitint_large_huge::lower_mul_overflow (t
   size_one_node);
  insert_before (g);
  g = gimple_build_cond (NE_EXPR, idx_next,
-size_int (endlimb + (cnt == 1)),
+size_int (endlimb + (cnt == 2)),
 NULL_TREE, NULL_TREE);
  insert_before (g);
  edge true_edge, false_edge;
--- gcc/testsuite/gcc.dg/torture/bitint-59.c.jj 2024-02-21 20:07:11.028142323 +0100
+++ gcc/testsuite/gcc.dg/torture/bitint-59.c 2024-02-21 20:07:57.854498649 +0100
@@ -0,0 +1,22 @@
+/* PR tree-optimization/114038 */
+/* { dg-do run { target bitint } } */
+/* { dg-options "-std=c23 -pedantic-errors" } */
+/* { dg-skip-if "" { ! run_expensive_tests }  { "*" } { "-O0" "-O2" } } */
+/* { dg-skip-if "" { ! run_expensive_tests } { "-flto" } { "" } } */
+
+#if __BITINT_MAXWIDTH__ >= 129
+int
+foo (unsigned _BitInt(63) x, unsigned _BitInt(129) y)
+{
+  return __builtin_mul_overflow_p (y, x, 0);
+}
+#endif
+
+int
+main ()
+{
+#if __BITINT_MAXWIDTH__ >= 129
+  if (!foo (90, 0x8000uwb))
+__builtin_abort ();
+#endif
+}

Jakub



Re: [PATCH] c++: -Wuninitialized when binding a ref to uninit DM [PR113987]

2024-02-22 Thread Jason Merrill

On 2/20/24 19:15, Marek Polacek wrote:

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
This PR asks that our -Wuninitialized for mem-initializers does
not warn when binding a reference to an uninitialized data member.
We already check !INDIRECT_TYPE_P in find_uninit_fields_r, but
that won't catch binding a parameter of a reference type to an
uninitialized field, as in:

   struct S { S (int&); };
   struct T {
   T() : s(i) {}
   S s;
   int i;
   };

This patch adds a new function to handle this case.


For type_build_ctor_call types like S, it's weird that we currently 
find_uninit_fields before building the initialization.  What if we move 
the check after the build_aggr_init so we have the actual initializer 
instead of just the expression?



PR c++/113987

gcc/cp/ChangeLog:

* call.cc (conv_binds_to_reference_parm_p): New.
* cp-tree.h (conv_binds_to_reference_parm_p): Declare.
* init.cc (find_uninit_fields_r): Call it.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wuninitialized-15.C: Turn dg-warning into dg-bogus.
* g++.dg/warn/Wuninitialized-34.C: New test.
---
  gcc/cp/call.cc| 24 ++
  gcc/cp/cp-tree.h  |  1 +
  gcc/cp/init.cc|  3 +-
  gcc/testsuite/g++.dg/warn/Wuninitialized-15.C |  3 +-
  gcc/testsuite/g++.dg/warn/Wuninitialized-34.C | 32 +++
  5 files changed, 60 insertions(+), 3 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/warn/Wuninitialized-34.C

diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
index 1dac1470d3b..c40ef2e3028 100644
--- a/gcc/cp/call.cc
+++ b/gcc/cp/call.cc
@@ -14551,4 +14551,28 @@ maybe_show_nonconverting_candidate (tree to, tree from, tree arg, int flags)
"function was not considered");
  }
  
+/* We're converting EXPR to TYPE.  If that conversion involves a conversion

+   function and we're binding EXPR to a reference parameter of that function,
+   return true.  */
+
+bool
+conv_binds_to_reference_parm_p (tree type, tree expr)
+{
+  conversion_obstack_sentinel cos;
+  conversion *c = implicit_conversion (type, TREE_TYPE (expr), expr,
+  /*c_cast_p=*/false, LOOKUP_NORMAL,
+  tf_none);
+  if (c && !c->bad_p && c->user_conv_p)
+for (; c; c = next_conversion (c))
+  if (c->kind == ck_user)
+   for (z_candidate *cand = c->cand; cand; cand = cand->next)
+ if (cand->viable == 1)
+   for (size_t i = 0; i < cand->num_convs; ++i)
+ if (cand->convs[i]->kind == ck_ref_bind
+ && conv_get_original_expr (cand->convs[i]) == expr)
+   return true;
+
+  return false;
+}
+
  #include "gt-cp-call.h"
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 334c11396c2..ce2d85f1f86 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -6845,6 +6845,7 @@ extern void cp_warn_deprecated_use_scopes (tree);
  extern tree get_function_version_dispatcher   (tree);
  extern bool any_template_arguments_need_structural_equality_p (tree);
  extern void maybe_show_nonconverting_candidate(tree, tree, tree, int);
+extern bool conv_binds_to_reference_parm_p (tree, tree);
  
  /* in class.cc */

  extern tree build_vfield_ref  (tree, tree);
diff --git a/gcc/cp/init.cc b/gcc/cp/init.cc
index ac37330527e..1a341f7e606 100644
--- a/gcc/cp/init.cc
+++ b/gcc/cp/init.cc
@@ -906,7 +906,8 @@ find_uninit_fields_r (tree *tp, int *walk_subtrees, void *data)
warning_at (EXPR_LOCATION (init), OPT_Wuninitialized,
"reference %qD is not yet bound to a value when used "
"here", field);
- else if (!INDIRECT_TYPE_P (type) || is_this_parameter (d->member))
+ else if ((!INDIRECT_TYPE_P (type) || is_this_parameter (d->member))
+  && !conv_binds_to_reference_parm_p (type, init))
warning_at (EXPR_LOCATION (init), OPT_Wuninitialized,
"member %qD is used uninitialized", field);
  *walk_subtrees = false;
diff --git a/gcc/testsuite/g++.dg/warn/Wuninitialized-15.C b/gcc/testsuite/g++.dg/warn/Wuninitialized-15.C
index 89e90668c41..2fd33037bfd 100644
--- a/gcc/testsuite/g++.dg/warn/Wuninitialized-15.C
+++ b/gcc/testsuite/g++.dg/warn/Wuninitialized-15.C
@@ -65,8 +65,7 @@ struct H {
G g;
A a2;
H() : g(a1) { }
-  // ??? clang++ doesn't warn here
-  H(int) : g(a2) { } // { dg-warning "member .H::a2. is used uninitialized" }
+  H(int) : g(a2) { } // { dg-bogus "member .H::a2. is used uninitialized" }
  };
  
  struct I {

diff --git a/gcc/testsuite/g++.dg/warn/Wuninitialized-34.C b/gcc/testsuite/g++.dg/warn/Wuninitialized-34.C
new file mode 100644
index 000..28226d8032e
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wuninitialized-34.C
@@ -0,0 +1,32 @@
+// PR c++/113987
+// { dg-do compile }
+// { dg-optio

[PATCH] c: Handle scoped attributes in __has*attribute and scoped attribute parsing changes in -std=c11 etc. modes [PR114007]

2024-02-22 Thread Jakub Jelinek
Hi!

We aren't able to parse __has_attribute (vendor::attr) (and __has_c_attribute
and __has_cpp_attribute) in strict C < C23 modes.  While in -std=gnu* modes
or in -std=c23 there is a CPP_SCOPE token, in -std=c* (except for -std=c23)
there is just a pair of CPP_COLON tokens.
The c-lex.cc hunk adds support for that.

That leads to a question if we should return 1 or 0 from
__has_attribute (gnu::unused) or not, because while
[[gnu::unused]] is parsed fine in -std=gnu*/-std=c23 modes (sure, with
pedwarn for < C23), we do not parse it at all in -std=c* (except for
-std=c23), we only parse [[__extension__ gnu::unused]] there.  While
the __extension__ in there helps to avoid the pedwarn, I think it is
better to be consistent between GNU and strict C < C23 modes and
parse [[gnu::unused]] too; on the other side, I think parsing
[[__extension__ gnu : : unused]] is too weird and undesirable.

So, the following patch adds a flag during preprocessing at the point
where we normally create CPP_SCOPE tokens out of 2 consecutive colons
on the first CPP_COLON to mark the consecutive case (as we are tight
on the bits, I've reused the PURE_ZERO flag, which is used just by the
C++ FE and only ever set (both C and C++) on CPP_NUMBER tokens, this
new flag has the same value and is only ever used on CPP_COLON tokens)
and instead of checking loose_scope_p argument (i.e. whether it is
[[__extension__ ...]] or not), it just parses CPP_SCOPE or CPP_COLON
with COLON_SCOPE flag followed by another CPP_COLON the same.
The latter will never appear in >= C23 or -std=gnu* modes, though
guarding its use with, say, flag_iso && !flag_isoc23 doesn't really
work because the __extension__ case temporarily clears flag_iso flag.

This makes the -std=c11 etc. behavior more similar to -std=gnu11 or
-std=c23, the only difference I'm aware of are the
#define JOIN2(A, B) A##B
[[vendor JOIN2(:,:) attr]]
[[__extension__ vendor JOIN2(:,:) attr]]
cases, which are accepted in the latter modes, but result in an error
in -std=c11; but the error is during preprocessing that :: doesn't
form a valid preprocessing token, which is true, so just don't do that if
you try to have __STRICT_ANSI__ && __STDC_VERSION__ <= 201710L
compatibility.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-02-21  Jakub Jelinek  

PR c/114007
gcc/
* doc/extend.texi (__extension__): Remove comments about scope
tokens vs. two colons.
gcc/c-family/
* c-lex.cc (c_common_has_attribute): Parse 2 CPP_COLONs with
the first one with COLON_SCOPE flag the same as CPP_SCOPE.
gcc/c/
* c-parser.cc (c_parser_std_attribute): Remove loose_scope_p argument.
Instead of checking it, parse 2 CPP_COLONs with the first one with
COLON_SCOPE flag the same as CPP_SCOPE.
(c_parser_std_attribute_list): Remove loose_scope_p argument, don't
pass it to c_parser_std_attribute.
(c_parser_std_attribute_specifier): Adjust c_parser_std_attribute_list
caller.
gcc/testsuite/
* gcc.dg/c23-attr-syntax-6.c: Adjust testcase for :: being valid
even in -std=c11 even without __extension__ and : : etc. not being
valid anymore even with __extension__.
* gcc.dg/c23-attr-syntax-7.c: Likewise.
* gcc.dg/c23-attr-syntax-8.c: New test.
libcpp/
* include/cpplib.h (COLON_SCOPE): Define to PURE_ZERO.
* lex.cc (_cpp_lex_direct): When lexing CPP_COLON with another
colon after it, if !CPP_OPTION (pfile, scope) set COLON_SCOPE
flag on the first CPP_COLON token.

--- gcc/doc/extend.texi.jj  2024-02-21 10:46:55.746515061 +0100
+++ gcc/doc/extend.texi 2024-02-21 17:03:22.137717754 +0100
@@ -12626,10 +12626,7 @@ In C, writing:
 @end smallexample
 
 suppresses warnings about using @samp{[[]]} attributes in C versions
-that predate C23@.  Since the scope token @samp{::} is not a single
-lexing token in earlier versions of C, this construct also allows two colons
-to be used in place of @code{::}.  GCC does not check whether the two
-colons are immediately adjacent.
+that predate C23@.
 @end itemize
 
 @code{__extension__} has no effect aside from this.
--- gcc/c-family/c-lex.cc.jj 2024-01-03 12:07:02.171734141 +0100
+++ gcc/c-family/c-lex.cc   2024-02-21 18:07:37.615640395 +0100
@@ -357,7 +357,24 @@ c_common_has_attribute (cpp_reader *pfil
   do
nxt_token = cpp_peek_token (pfile, idx++);
   while (nxt_token->type == CPP_PADDING);
-  if (nxt_token->type == CPP_SCOPE)
+  if (!c_dialect_cxx ()
+ && nxt_token->type == CPP_COLON
+ && (nxt_token->flags & COLON_SCOPE) != 0)
+   {
+ do
+   nxt_token = cpp_peek_token (pfile, idx++);
+ while (nxt_token->type == CPP_PADDING);
+ if (nxt_token->type == CPP_COLON)
+   {
+ /* __has_attribute (vendor::attr) in -std=c17 etc. modes.
+:: isn't CPP_SCOPE but 2 CPP_COLON tokens, where the
+