date:20221130

Re: [PATCH] longlong.h: Do no use asm input cast for clang

2022-11-30 Thread Richard Biener via Gcc-patches

On Thu, Dec 1, 2022 at 12:26 AM Segher Boessenkool
 wrote:
>
> Hi!
>
> On Wed, Nov 30, 2022 at 03:16:25PM -0300, Adhemerval Zanella via Gcc-patches 
> wrote:
> > clang by default rejects the input casts with:
> >
> >   error: invalid use of a cast in a inline asm context requiring an
> >   lvalue: remove the cast or build with -fheinous-gnu-extensions
> >
> > And even with -fheinous-gnu-extensions clang still throws a warning
> > and also states that this option might be removed in the future.
> > For gcc the casts are still somewhat useful [1], so just remove them
> > when clang is used.
>
> This is one of the things in inline asm that is tightly tied to GCC
> internals.  You should emulate GCC's behaviour faithfully if you want
> to claim you implement the inline asm GNU C extension.

I understand that the casts should be no-ops on the asm side (maybe they
change the sign) and they are present as type-checking.  Can we implement
this type-checking in a different (portable) way?  I think the macro you use
should be named like __asm_output_check_type (..) or so to indicate the
intended purpose.

Richard.
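
One possible shape for such a check, as a minimal sketch and not the longlong.h patch itself: the macro below verifies an operand's type at compile time without placing a cast inside the asm operand list. The names `ASM_CHECK_TYPE` and `add_ulong` are hypothetical, and the sketch assumes a C11 compiler that supports the GNU `__typeof__` and `__builtin_types_compatible_p` extensions, which both GCC and clang do.

```c
/* Hypothetical helper: compilation fails unless EXPR has type TYPE
   (top-level qualifiers are ignored).  No cast reaches the asm operands.  */
#define ASM_CHECK_TYPE(type, expr)                                        \
  _Static_assert (__builtin_types_compatible_p (__typeof__ (expr), type), \
                  "asm operand has an unexpected type")

/* Illustrative user only.  A real one would contain an asm statement
   with sum as an output operand in place of the plain addition.  */
static unsigned long
add_ulong (unsigned long a, unsigned long b)
{
  unsigned long sum;
  ASM_CHECK_TYPE (unsigned long, sum);
  sum = a + b;
  return sum;
}
```

Declaring `sum` as `unsigned int` instead would make the build fail at the `ASM_CHECK_TYPE` line, which is the diagnostic the casts were presumably meant to provide.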

> > --- a/include/ChangeLog
> > +++ b/include/ChangeLog
>
> That should not be part of the patch?  Changelog entries should be
> verbatim in the message you send.
>
> The size of this patch already makes clear this is a bad idea, imo.
> This code is already hard enough to read.
>
>
> Segher

Re: [PATCH 2/3]rs6000: NFC use sext_hwi to replace ((v&0xf..f)^0x80..0) - 0x80..0

2022-11-30 Thread Jiufu Guo via Gcc-patches

Hi Kewen,

On 12/1/22 2:11 PM, Kewen.Lin wrote:
> on 2022/12/1 13:35, Jiufu Guo wrote:
>> Hi Kewen,
>>
>> Thanks for your quick and insightful review!
>>
>> On 12/1/22 1:17 PM, Kewen.Lin wrote:
>>> Hi Jeff,
>>>
>>> on 2022/12/1 09:36, Jiufu Guo wrote:
>>>> Hi,
>>>>
>>>> This patch just uses sext_hwi to replace the expression like:
>>>> ((value & 0xf..f) ^ 0x80..0) - 0x80..0 for rs6000.cc and rs6000.md.
>>>>
>>>> Bootstrap & regtest pass on ppc64{,le}.
>>>> Is this ok for trunk? 
>>>
>>> You didn't say it clearly but I guessed you have grepped in the whole
>>> config/rs6000 directory, right?  I noticed there are still two places
>>> using this kind of expression in function constant_generates_xxspltiw,
>>> but I assumed it's intentional as their types are not HOST_WIDE_INT.
>>>
>>> gcc/config/rs6000/rs6000.cc:  short sign_h_word = ((h_word & 0xffff) ^ 0x8000) - 0x8000;
>>> gcc/config/rs6000/rs6000.cc:  int sign_word = ((word & 0xffffffff) ^ 0x80000000) - 0x80000000;
>>>
>>> If so, could you state it clearly in commit log like "with type
>>> signed/unsigned HOST_WIDE_INT" or similar?
>>>
>> Good question!
>>
>> And as you said sext_hwi is more for "signed/unsigned HOST_WIDE_INT".
>> For these two places, it seems sext_hwi is not needed actually!
>> And I did not see why these expressions are used; maybe just an assignment
>> is OK.
> 
> ah, I see.  I agree using the assignment is quite enough.  Could you
> please also simplify them together?  Since they are with the form 
> "((value & 0xf..f) ^ 0x80..0) - 0x80..0" too, and can be refactored
> in a better way.  Thanks!

Sure, I believe just "short sign_h_word = vsx_const->half_words[0];"
should be correct :-), and it is included in the updated patch.

The updated patch is attached; bootstrap & regtest is ongoing.


BR,
Jeff (Jiufu)
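
To illustrate why the replacement is behaviour-preserving, here is a minimal sketch comparing the open-coded idiom with a sign-extension helper. `sext64` is a hypothetical stand-in written for this note, not GCC's actual `sext_hwi`, and it assumes a precision between 1 and 63 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for sext_hwi: sign-extend the low PREC bits
   of SRC.  This sketch only handles 1 <= prec <= 63.  */
static int64_t
sext64 (int64_t src, unsigned int prec)
{
  assert (prec >= 1 && prec <= 63);
  uint64_t mask = (UINT64_C (1) << prec) - 1;
  uint64_t sign = UINT64_C (1) << (prec - 1);
  uint64_t low = (uint64_t) src & mask;
  /* Same idea as the open-coded idiom: flip the sign bit, then
     subtract it again.  */
  return (int64_t) (low ^ sign) - (int64_t) sign;
}

/* The open-coded 16-bit form that the patch replaces.  */
static int64_t
old_idiom16 (int64_t v)
{
  return ((v & 0xffff) ^ 0x8000) - 0x8000;
}
```

For example, `sext64 (0x12348000, 16)` and `old_idiom16 (0x12348000)` both give -32768, and both leave 0x1234 unchanged.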

> 
> BR,
> Kewen
> From 8aa8e1234b6ec34473434951a3a6177253aac770 Mon Sep 17 00:00:00 2001
From: Jiufu Guo 
Date: Wed, 30 Nov 2022 13:13:37 +0800
Subject: [PATCH 2/2]rs6000: update ((v&0xf..f)^0x80..0) - 0x80..0 with code: 
like sext_hwi

This patch just replaces the expression like: 
((value & 0xf..f) ^ 0x80..0) - 0x80..0 to better code(e.g. sext_hwi) for
rs6000.cc, rs6000.md and predicates.md (files under rs6000/).

gcc/ChangeLog:

* config/rs6000/predicates.md: Use sext_hwi.
* config/rs6000/rs6000.cc (num_insns_constant_gpr): Likewise.
(darwin_rs6000_legitimate_lo_sum_const_p): Likewise.
(mem_operand_gpr): Likewise.
(mem_operand_ds_form): Likewise.
(rs6000_legitimize_address): Likewise.
(rs6000_emit_set_const): Likewise.
(rs6000_emit_set_long_const): Likewise.
(print_operand): Likewise.
(constant_generates_xxspltiw): Remove unnecessary expressions.
* config/rs6000/rs6000.md: Use sext_hwi.

---
 gcc/config/rs6000/predicates.md |  2 +-
 gcc/config/rs6000/rs6000.cc | 36 ++---
 gcc/config/rs6000/rs6000.md | 10 -
 3 files changed, 21 insertions(+), 27 deletions(-)

diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index b1fcc69bb60..a1764018545 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -760,7 +760,7 @@ (define_predicate "easy_vector_constant_add_self"
 return 0;
   elt = BYTES_BIG_ENDIAN ? GET_MODE_NUNITS (mode) - 1 : 0;
   val = const_vector_elt_as_int (op, elt);
-  val = ((val & 0xff) ^ 0x80) - 0x80;
+  val = sext_hwi (val, 8);
   return EASY_VECTOR_15_ADD_SELF (val);
 })
 
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 5efe9b22d8b..dff9a0d8835 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -6021,7 +6021,7 @@ num_insns_constant_gpr (HOST_WIDE_INT value)
 
   else if (TARGET_POWERPC64)
 {
-  HOST_WIDE_INT low  = ((value & 0xffffffff) ^ 0x80000000) - 0x80000000;
+  HOST_WIDE_INT low = sext_hwi (value, 32);
   HOST_WIDE_INT high = value >> 31;
 
   if (high == 0 || high == -1)
@@ -8456,7 +8456,7 @@ darwin_rs6000_legitimate_lo_sum_const_p (rtx x, 
machine_mode mode)
 }
 
   /* We only care if the access(es) would cause a change to the high part.  */
-  offset = ((offset & 0xffff) ^ 0x8000) - 0x8000;
+  offset = sext_hwi (offset, 16);
   return SIGNED_16BIT_OFFSET_EXTRA_P (offset, extra);
 }
 
@@ -8522,7 +8522,7 @@ mem_operand_gpr (rtx op, machine_mode mode)
   if (GET_CODE (addr) == LO_SUM)
 /* For lo_sum addresses, we must allow any offset except one that
causes a wrap, so test only the low 16 bits.  */
-offset = ((offset & 0xffff) ^ 0x8000) - 0x8000;
+offset = sext_hwi (offset, 16);
 
   return SIGNED_16BIT_OFFSET_EXTRA_P (offset, extra);
 }
@@ -8562,7 +8562,7 @@ mem_operand_ds_form (rtx op, machine_mode mode)
   if (GET_CODE (addr) == LO_SUM)
 /* For lo_sum addresses, we must allow any offset except one that
causes a wrap, so test only the low 16 bits.  */
-offset = ((offset & 0xffff) ^ 0x8000) - 0x8000;
+offset = sex

Re: [PATCH 2/3]rs6000: NFC use sext_hwi to replace ((v&0xf..f)^0x80..0) - 0x80..0

2022-11-30 Thread Jiufu Guo via Gcc-patches

Hi Kewen,

On 12/1/22 1:30 PM, Kewen.Lin wrote:
> on 2022/12/1 13:17, Kewen.Lin via Gcc-patches wrote:
>> Hi Jeff,
>>
>> on 2022/12/1 09:36, Jiufu Guo wrote:
>>> Hi,
>>>
>>> This patch just uses sext_hwi to replace the expression like:
>>> ((value & 0xf..f) ^ 0x80..0) - 0x80..0 for rs6000.cc and rs6000.md.
>>>
>>> Bootstrap & regtest pass on ppc64{,le}.
>>> Is this ok for trunk? 
>>
>> You didn't say it clearly but I guessed you have grepped in the whole
>> config/rs6000 directory, right?  I noticed there are still two places
>> using this kind of expression in function constant_generates_xxspltiw,
>> but I assumed it's intentional as their types are not HOST_WIDE_INT.
>>
>> gcc/config/rs6000/rs6000.cc:  short sign_h_word = ((h_word & 0xffff) ^ 0x8000) - 0x8000;
>> gcc/config/rs6000/rs6000.cc:  int sign_word = ((word & 0xffffffff) ^ 0x80000000) - 0x80000000;
>>
> 
> oh, one place in gcc/config/rs6000/predicates.md got missed.
> 
> ./predicates.md-756-{
> ./predicates.md-757-  HOST_WIDE_INT val;
> ...
> ./predicates.md-762-  val = const_vector_elt_as_int (op, elt);
> ./predicates.md:763:  val = ((val & 0xff) ^ 0x80) - 0x80;
> ./predicates.md-764-  return EASY_VECTOR_15_ADD_SELF (val);
> ./predicates.md-765-})
> 
> Do you mind to have a further check?

Good catch, thanks!
I will update the patch to cover this one and rerun bootstrap and testing.

It would be better to check all files under rs6000/. I just rechecked
with grep -r "^.*0x8.*-.*0x8" for rs6000.  No other place is missed.


BR,
Jeff (Jiufu)


The updated patch is attached (it adds predicates.md).


> 
> Thanks!
> 
> Kewen
From 0059e2175cac5353890965ba782ff58743d2f486 Mon Sep 17 00:00:00 2001
From: Jiufu Guo 
Date: Wed, 30 Nov 2022 13:13:37 +0800
Subject: [PATCH 2/2]rs6000: use sext_hwi to replace ((v&0xf..f)^0x80..0) -
 0x80..0


This patch just uses sext_hwi to replace the expression like:
((value & 0xf..f) ^ 0x80..0) - 0x80..0 for rs6000.cc, rs6000.md and
predicates.md (all occurrences under rs6000/).


gcc/ChangeLog:

* config/rs6000/predicates.md: Use sext_hwi.
* config/rs6000/rs6000.cc (num_insns_constant_gpr): Use sext_hwi.
(darwin_rs6000_legitimate_lo_sum_const_p): Likewise.
(mem_operand_gpr): Likewise.
(mem_operand_ds_form): Likewise.
(rs6000_legitimize_address): Likewise.
(rs6000_emit_set_const): Likewise.
(rs6000_emit_set_long_const): Likewise.
(print_operand): Likewise.
* config/rs6000/rs6000.md: Likewise.

---
 gcc/config/rs6000/predicates.md |  2 +-
 gcc/config/rs6000/rs6000.cc | 30 +-
 gcc/config/rs6000/rs6000.md | 10 +-
 3 files changed, 19 insertions(+), 23 deletions(-)

diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index b1fcc69bb60..a1764018545 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -760,7 +760,7 @@ (define_predicate "easy_vector_constant_add_self"
 return 0;
   elt = BYTES_BIG_ENDIAN ? GET_MODE_NUNITS (mode) - 1 : 0;
   val = const_vector_elt_as_int (op, elt);
-  val = ((val & 0xff) ^ 0x80) - 0x80;
+  val = sext_hwi (val, 8);
   return EASY_VECTOR_15_ADD_SELF (val);
 })
 
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 5efe9b22d8b..718072cc9a1 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -6021,7 +6021,7 @@ num_insns_constant_gpr (HOST_WIDE_INT value)
 
   else if (TARGET_POWERPC64)
 {
-  HOST_WIDE_INT low  = ((value & 0xffffffff) ^ 0x80000000) - 0x80000000;
+  HOST_WIDE_INT low = sext_hwi (value, 32);
   HOST_WIDE_INT high = value >> 31;
 
   if (high == 0 || high == -1)
@@ -8456,7 +8456,7 @@ darwin_rs6000_legitimate_lo_sum_const_p (rtx x, 
machine_mode mode)
 }
 
   /* We only care if the access(es) would cause a change to the high part.  */
-  offset = ((offset & 0xffff) ^ 0x8000) - 0x8000;
+  offset = sext_hwi (offset, 16);
   return SIGNED_16BIT_OFFSET_EXTRA_P (offset, extra);
 }
 
@@ -8522,7 +8522,7 @@ mem_operand_gpr (rtx op, machine_mode mode)
   if (GET_CODE (addr) == LO_SUM)
 /* For lo_sum addresses, we must allow any offset except one that
causes a wrap, so test only the low 16 bits.  */
-offset = ((offset & 0xffff) ^ 0x8000) - 0x8000;
+offset = sext_hwi (offset, 16);
 
   return SIGNED_16BIT_OFFSET_EXTRA_P (offset, extra);
 }
@@ -8562,7 +8562,7 @@ mem_operand_ds_form (rtx op, machine_mode mode)
   if (GET_CODE (addr) == LO_SUM)
 /* For lo_sum addresses, we must allow any offset except one that
causes a wrap, so test only the low 16 bits.  */
-offset = ((offset & 0xffff) ^ 0x8000) - 0x8000;
+offset = sext_hwi (offset, 16);
 
   return SIGNED_16BIT_OFFSET_EXTRA_P (offset, extra);
 }
@@ -9136,7 +9136,7 @@ rs6000_legitimize_address (rtx x, rtx oldx 
ATTRIBUTE_UNUSED,
 {
   HOST_WIDE_INT high_int, low_int;
   rtx sum;
-  low_int = ((INTVAL (XEXP (x, 1)) & 0xffff) ^ 0x8000) - 0x8000;
+  low

[PATCH] [x86] Fix ICE due to incorrect insn type.

2022-11-30 Thread liuhongt via Gcc-patches

;; if reg/mem op
(define_insn_reservation  "slm_sseishft_3" 2
  (and (eq_attr "cpu" "slm")
       (and (eq_attr "type" "sseishft")
            (not (match_operand 2 "immediate_operand"))))
  "slm-complex, slm-all-eu")

slm.md checks operands[2] for insns of type sseishft, but
extendbfsf2_1 has no operand 2, which caused the ICE.
The patch changes the type from sseishft to sseishft1 to fix the issue.

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ready to push as an obvious patch.

gcc/ChangeLog:

PR target/107934
* config/i386/i386.md (extendbfsf2_1): Change type from
sseishft to sseishft1.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr107934.c: New test.
---
 gcc/config/i386/i386.md  | 2 +-
 gcc/testsuite/gcc.target/i386/pr107934.c | 8 ++++++++
 2 files changed, 9 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr107934.c

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 9451883396c..9e1d9eec862 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -4981,7 +4981,7 @@ (define_insn "extendbfsf2_1"
   pslld\t{$16, %0|%0, 16}
   vpslld\t{$16, %1, %0|%0, %1, 16}"
   [(set_attr "isa" "noavx,avx")
-   (set_attr "type" "sseishft")
+   (set_attr "type" "sseishft1")
    (set_attr "length_immediate" "1")
    (set_attr "prefix_data16" "1,*")
    (set_attr "prefix" "orig,vex")
diff --git a/gcc/testsuite/gcc.target/i386/pr107934.c b/gcc/testsuite/gcc.target/i386/pr107934.c
new file mode 100644
index 00000000000..59106b29159
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr107934.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mtune=knl -ffinite-math-only -msse2" } */
+
+int
+foo (__bf16 bf)
+{
+  return bf;
+}
-- 
2.27.0

Re: [PATCH 2/3]rs6000: NFC use sext_hwi to replace ((v&0xf..f)^0x80..0) - 0x80..0

2022-11-30 Thread Kewen.Lin via Gcc-patches

on 2022/12/1 13:35, Jiufu Guo wrote:
> Hi Kewen,
> 
> Thanks for your quick and insightful review!
> 
> On 12/1/22 1:17 PM, Kewen.Lin wrote:
>> Hi Jeff,
>>
>> on 2022/12/1 09:36, Jiufu Guo wrote:
>>> Hi,
>>>
>>> This patch just uses sext_hwi to replace the expression like:
>>> ((value & 0xf..f) ^ 0x80..0) - 0x80..0 for rs6000.cc and rs6000.md.
>>>
>>> Bootstrap & regtest pass on ppc64{,le}.
>>> Is this ok for trunk? 
>>
>> You didn't say it clearly but I guessed you have grepped in the whole
>> config/rs6000 directory, right?  I noticed there are still two places
>> using this kind of expression in function constant_generates_xxspltiw,
>> but I assumed it's intentional as their types are not HOST_WIDE_INT.
>>
>> gcc/config/rs6000/rs6000.cc:  short sign_h_word = ((h_word & 0xffff) ^ 0x8000) - 0x8000;
>> gcc/config/rs6000/rs6000.cc:  int sign_word = ((word & 0xffffffff) ^ 0x80000000) - 0x80000000;
>>
>> If so, could you state it clearly in commit log like "with type
>> signed/unsigned HOST_WIDE_INT" or similar?
>>
> Good question!
> 
> And as you said sext_hwi is more for "signed/unsigned HOST_WIDE_INT".
> For these two places, it seems sext_hwi is not needed actually!
> And I did not see why these expressions are used; maybe just an assignment
> is OK.

ah, I see.  I agree using the assignment is quite enough.  Could you
please also simplify them together?  Since they are with the form 
"((value & 0xf..f) ^ 0x80..0) - 0x80..0" too, and can be refactored
in a better way.  Thanks!

BR,
Kewen

Re: [PATCH 2/3]rs6000: NFC use sext_hwi to replace ((v&0xf..f)^0x80..0) - 0x80..0

2022-11-30 Thread Jiufu Guo via Gcc-patches

Hi Kewen,

Thanks for your quick and insightful review!

On 12/1/22 1:17 PM, Kewen.Lin wrote:
> Hi Jeff,
> 
> on 2022/12/1 09:36, Jiufu Guo wrote:
>> Hi,
>>
>> This patch just uses sext_hwi to replace the expression like:
>> ((value & 0xf..f) ^ 0x80..0) - 0x80..0 for rs6000.cc and rs6000.md.
>>
>> Bootstrap & regtest pass on ppc64{,le}.
>> Is this ok for trunk? 
> 
> You didn't say it clearly but I guessed you have grepped in the whole
> config/rs6000 directory, right?  I noticed there are still two places
> using this kind of expression in function constant_generates_xxspltiw,
> but I assumed it's intentional as their types are not HOST_WIDE_INT.
> 
> gcc/config/rs6000/rs6000.cc:  short sign_h_word = ((h_word & 0xffff) ^ 0x8000) - 0x8000;
> gcc/config/rs6000/rs6000.cc:  int sign_word = ((word & 0xffffffff) ^ 0x80000000) - 0x80000000;
> 
> If so, could you state it clearly in commit log like "with type
> signed/unsigned HOST_WIDE_INT" or similar?
> 
Good question!

And as you said sext_hwi is more for "signed/unsigned HOST_WIDE_INT".
For these two places, it seems sext_hwi is not needed actually!
And I did not see why these expressions are used; maybe just an assignment
is OK.

So, this patch does not cover these two places.

BR,
Jeff (Jiufu)

> This patch is OK once the above question gets confirmed, thanks!
> 
> BR,
> Kewen
> 
>>
>> BR,
>> Jeff (Jiufu)
>>
>> gcc/ChangeLog:
>>
>>  * config/rs6000/rs6000.cc (num_insns_constant_gpr): Use sext_hwi.
>>  (darwin_rs6000_legitimate_lo_sum_const_p): Likewise.
>>  (mem_operand_gpr): Likewise.
>>  (mem_operand_ds_form): Likewise.
>>  (rs6000_legitimize_address): Likewise.
>>  (rs6000_emit_set_const): Likewise.
>>  (rs6000_emit_set_long_const): Likewise.
>>  (print_operand): Likewise.
>>  * config/rs6000/rs6000.md: Likewise.
>>
>> ---
>>  gcc/config/rs6000/rs6000.cc | 30 +-
>>  gcc/config/rs6000/rs6000.md | 10 +-
>>  2 files changed, 18 insertions(+), 22 deletions(-)
>>
>> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
>> index 5efe9b22d8b..718072cc9a1 100644
>> --- a/gcc/config/rs6000/rs6000.cc
>> +++ b/gcc/config/rs6000/rs6000.cc
>> @@ -6021,7 +6021,7 @@ num_insns_constant_gpr (HOST_WIDE_INT value)
>>
>>else if (TARGET_POWERPC64)
>>  {
>> -  HOST_WIDE_INT low  = ((value & 0xffffffff) ^ 0x80000000) - 0x80000000;
>> +  HOST_WIDE_INT low = sext_hwi (value, 32);
>>HOST_WIDE_INT high = value >> 31;
>>
>>if (high == 0 || high == -1)
>> @@ -8456,7 +8456,7 @@ darwin_rs6000_legitimate_lo_sum_const_p (rtx x, 
>> machine_mode mode)
>>  }
>>
>>/* We only care if the access(es) would cause a change to the high part.  
>> */
>> -  offset = ((offset & 0xffff) ^ 0x8000) - 0x8000;
>> +  offset = sext_hwi (offset, 16);
>>return SIGNED_16BIT_OFFSET_EXTRA_P (offset, extra);
>>  }
>>
>> @@ -8522,7 +8522,7 @@ mem_operand_gpr (rtx op, machine_mode mode)
>>if (GET_CODE (addr) == LO_SUM)
>>  /* For lo_sum addresses, we must allow any offset except one that
>> causes a wrap, so test only the low 16 bits.  */
>> -offset = ((offset & 0xffff) ^ 0x8000) - 0x8000;
>> +offset = sext_hwi (offset, 16);
>>
>>return SIGNED_16BIT_OFFSET_EXTRA_P (offset, extra);
>>  }
>> @@ -8562,7 +8562,7 @@ mem_operand_ds_form (rtx op, machine_mode mode)
>>if (GET_CODE (addr) == LO_SUM)
>>  /* For lo_sum addresses, we must allow any offset except one that
>> causes a wrap, so test only the low 16 bits.  */
>> -offset = ((offset & 0xffff) ^ 0x8000) - 0x8000;
>> +offset = sext_hwi (offset, 16);
>>
>>return SIGNED_16BIT_OFFSET_EXTRA_P (offset, extra);
>>  }
>> @@ -9136,7 +9136,7 @@ rs6000_legitimize_address (rtx x, rtx oldx 
>> ATTRIBUTE_UNUSED,
>>  {
>>HOST_WIDE_INT high_int, low_int;
>>rtx sum;
>> -  low_int = ((INTVAL (XEXP (x, 1)) & 0xffff) ^ 0x8000) - 0x8000;
>> +  low_int = sext_hwi (INTVAL (XEXP (x, 1)), 16);
>>if (low_int >= 0x8000 - extra)
>>  low_int = 0;
>>high_int = INTVAL (XEXP (x, 1)) - low_int;
>> @@ -10203,7 +10203,7 @@ rs6000_emit_set_const (rtx dest, rtx source)
>>lo = operand_subword_force (dest, WORDS_BIG_ENDIAN != 0,
>>DImode);
>>emit_move_insn (hi, GEN_INT (c >> 32));
>> -  c = ((c & 0xffffffff) ^ 0x80000000) - 0x80000000;
>> +  c = sext_hwi (c, 32);
>>emit_move_insn (lo, GEN_INT (c));
>>  }
>>else
>> @@ -10242,7 +10242,7 @@ rs6000_emit_set_long_const (rtx dest, HOST_WIDE_INT 
>> c)
>>
>>if ((ud4 == 0xffff && ud3 == 0xffff && ud2 == 0xffff && (ud1 & 0x8000))
>>|| (ud4 == 0 && ud3 == 0 && ud2 == 0 && ! (ud1 & 0x8000)))
>> -emit_move_insn (dest, GEN_INT ((ud1 ^ 0x8000) - 0x8000));
>> +emit_move_insn (dest, GEN_INT (sext_hwi (ud1, 16)));
>>
>>else if ((ud4 == 0xffff && ud3 == 0xffff && (ud2 & 0x8000))
>> || (ud4 == 0 && ud3 == 0 && ! (ud2 & 0x8000)))

Re: [PATCH 2/3]rs6000: NFC use sext_hwi to replace ((v&0xf..f)^0x80..0) - 0x80..0

2022-11-30 Thread Kewen.Lin via Gcc-patches

on 2022/12/1 13:17, Kewen.Lin via Gcc-patches wrote:
> Hi Jeff,
> 
> on 2022/12/1 09:36, Jiufu Guo wrote:
>> Hi,
>>
>> This patch just uses sext_hwi to replace the expression like:
>> ((value & 0xf..f) ^ 0x80..0) - 0x80..0 for rs6000.cc and rs6000.md.
>>
>> Bootstrap & regtest pass on ppc64{,le}.
>> Is this ok for trunk? 
> 
> You didn't say it clearly but I guessed you have grepped in the whole
> config/rs6000 directory, right?  I noticed there are still two places
> using this kind of expression in function constant_generates_xxspltiw,
> but I assumed it's intentional as their types are not HOST_WIDE_INT.
> 
> gcc/config/rs6000/rs6000.cc:  short sign_h_word = ((h_word & 0xffff) ^ 0x8000) - 0x8000;
> gcc/config/rs6000/rs6000.cc:  int sign_word = ((word & 0xffffffff) ^ 0x80000000) - 0x80000000;
> 

oh, one place in gcc/config/rs6000/predicates.md got missed.

./predicates.md-756-{
./predicates.md-757-  HOST_WIDE_INT val;
...
./predicates.md-762-  val = const_vector_elt_as_int (op, elt);
./predicates.md:763:  val = ((val & 0xff) ^ 0x80) - 0x80;
./predicates.md-764-  return EASY_VECTOR_15_ADD_SELF (val);
./predicates.md-765-})

Do you mind to have a further check?

Thanks!

Kewen

Re: [PATCH 2/3]rs6000: NFC use sext_hwi to replace ((v&0xf..f)^0x80..0) - 0x80..0

2022-11-30 Thread Kewen.Lin via Gcc-patches

Hi Jeff,

on 2022/12/1 09:36, Jiufu Guo wrote:
> Hi,
> 
> This patch just uses sext_hwi to replace the expression like:
> ((value & 0xf..f) ^ 0x80..0) - 0x80..0 for rs6000.cc and rs6000.md.
> 
> Bootstrap & regtest pass on ppc64{,le}.
> Is this ok for trunk? 

You didn't say it clearly but I guessed you have grepped in the whole
config/rs6000 directory, right?  I noticed there are still two places
using this kind of expression in function constant_generates_xxspltiw,
but I assumed it's intentional as their types are not HOST_WIDE_INT.

gcc/config/rs6000/rs6000.cc:  short sign_h_word = ((h_word & 0xffff) ^ 0x8000) - 0x8000;
gcc/config/rs6000/rs6000.cc:  int sign_word = ((word & 0xffffffff) ^ 0x80000000) - 0x80000000;

If so, could you state it clearly in commit log like "with type
signed/unsigned HOST_WIDE_INT" or similar?

This patch is OK once the above question gets confirmed, thanks!

BR,
Kewen

> 
> BR,
> Jeff (Jiufu)
> 
> gcc/ChangeLog:
> 
>   * config/rs6000/rs6000.cc (num_insns_constant_gpr): Use sext_hwi.
>   (darwin_rs6000_legitimate_lo_sum_const_p): Likewise.
>   (mem_operand_gpr): Likewise.
>   (mem_operand_ds_form): Likewise.
>   (rs6000_legitimize_address): Likewise.
>   (rs6000_emit_set_const): Likewise.
>   (rs6000_emit_set_long_const): Likewise.
>   (print_operand): Likewise.
>   * config/rs6000/rs6000.md: Likewise.
> 
> ---
>  gcc/config/rs6000/rs6000.cc | 30 +-
>  gcc/config/rs6000/rs6000.md | 10 +-
>  2 files changed, 18 insertions(+), 22 deletions(-)
> 
> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
> index 5efe9b22d8b..718072cc9a1 100644
> --- a/gcc/config/rs6000/rs6000.cc
> +++ b/gcc/config/rs6000/rs6000.cc
> @@ -6021,7 +6021,7 @@ num_insns_constant_gpr (HOST_WIDE_INT value)
> 
>else if (TARGET_POWERPC64)
>  {
> -  HOST_WIDE_INT low  = ((value & 0xffffffff) ^ 0x80000000) - 0x80000000;
> +  HOST_WIDE_INT low = sext_hwi (value, 32);
>HOST_WIDE_INT high = value >> 31;
> 
>if (high == 0 || high == -1)
> @@ -8456,7 +8456,7 @@ darwin_rs6000_legitimate_lo_sum_const_p (rtx x, 
> machine_mode mode)
>  }
> 
>/* We only care if the access(es) would cause a change to the high part.  
> */
> -  offset = ((offset & 0xffff) ^ 0x8000) - 0x8000;
> +  offset = sext_hwi (offset, 16);
>return SIGNED_16BIT_OFFSET_EXTRA_P (offset, extra);
>  }
> 
> @@ -8522,7 +8522,7 @@ mem_operand_gpr (rtx op, machine_mode mode)
>if (GET_CODE (addr) == LO_SUM)
>  /* For lo_sum addresses, we must allow any offset except one that
> causes a wrap, so test only the low 16 bits.  */
> -offset = ((offset & 0xffff) ^ 0x8000) - 0x8000;
> +offset = sext_hwi (offset, 16);
> 
>return SIGNED_16BIT_OFFSET_EXTRA_P (offset, extra);
>  }
> @@ -8562,7 +8562,7 @@ mem_operand_ds_form (rtx op, machine_mode mode)
>if (GET_CODE (addr) == LO_SUM)
>  /* For lo_sum addresses, we must allow any offset except one that
> causes a wrap, so test only the low 16 bits.  */
> -offset = ((offset & 0xffff) ^ 0x8000) - 0x8000;
> +offset = sext_hwi (offset, 16);
> 
>return SIGNED_16BIT_OFFSET_EXTRA_P (offset, extra);
>  }
> @@ -9136,7 +9136,7 @@ rs6000_legitimize_address (rtx x, rtx oldx 
> ATTRIBUTE_UNUSED,
>  {
>HOST_WIDE_INT high_int, low_int;
>rtx sum;
> -  low_int = ((INTVAL (XEXP (x, 1)) & 0xffff) ^ 0x8000) - 0x8000;
> +  low_int = sext_hwi (INTVAL (XEXP (x, 1)), 16);
>if (low_int >= 0x8000 - extra)
>   low_int = 0;
>high_int = INTVAL (XEXP (x, 1)) - low_int;
> @@ -10203,7 +10203,7 @@ rs6000_emit_set_const (rtx dest, rtx source)
> lo = operand_subword_force (dest, WORDS_BIG_ENDIAN != 0,
> DImode);
> emit_move_insn (hi, GEN_INT (c >> 32));
> -   c = ((c & 0xffffffff) ^ 0x80000000) - 0x80000000;
> +   c = sext_hwi (c, 32);
> emit_move_insn (lo, GEN_INT (c));
>   }
>else
> @@ -10242,7 +10242,7 @@ rs6000_emit_set_long_const (rtx dest, HOST_WIDE_INT c)
> 
>if ((ud4 == 0xffff && ud3 == 0xffff && ud2 == 0xffff && (ud1 & 0x8000))
>|| (ud4 == 0 && ud3 == 0 && ud2 == 0 && ! (ud1 & 0x8000)))
> -emit_move_insn (dest, GEN_INT ((ud1 ^ 0x8000) - 0x8000));
> +emit_move_insn (dest, GEN_INT (sext_hwi (ud1, 16)));
> 
>else if ((ud4 == 0xffff && ud3 == 0xffff && (ud2 & 0x8000))
>  || (ud4 == 0 && ud3 == 0 && ! (ud2 & 0x8000)))
> @@ -10250,7 +10250,7 @@ rs6000_emit_set_long_const (rtx dest, HOST_WIDE_INT c)
>temp = !can_create_pseudo_p () ? dest : gen_reg_rtx (DImode);
> 
>emit_move_insn (ud1 != 0 ? copy_rtx (temp) : dest,
> -   GEN_INT (((ud2 << 16) ^ 0x80000000) - 0x80000000));
> +   GEN_INT (sext_hwi (ud2 << 16, 32)));
>if (ud1 != 0)
>   emit_move_insn (dest,
>   gen_rtx_IOR (DImode, copy_rtx (temp),
> @@ -10261,8 +10261,7 @@ rs6000_e

Re: [PATCH 3/3]rs6000: NFC no need copy_rtx in rs6000_emit_set_long_const and rs6000_emit_set_const

2022-11-30 Thread Jiufu Guo via Gcc-patches



Hi Kewen,

On 12/1/22 11:31 AM, Kewen.Lin wrote:
> Hi Jeff,
> 
> on 2022/12/1 09:36, Jiufu Guo wrote:
>> Hi,
>>
>> Function rs6000_emit_set_const/rs6000_emit_set_long_const are only invoked 
>> from
>> two "define_split"s where the target operand is limited to gpc_reg_operand or
>> int_reg_operand, then the operand must be REG_P.
>> And in rs6000_emit_set_const/rs6000_emit_set_long_const, to create temp rtx,
>> it is using code like "gen_reg_rtx({S|D}Imode)", it must also be REG_P.
>> So, copy_rtx is not needed for temp and dest.
>>
>> This patch removes those "copy_rtx" for rs6000_emit_set_const and
>> rs6000_emit_set_long_const.
>>
>> Bootstrap & regtest pass on ppc64{,le}.
>> Is this ok for trunk? 
> 
> This patch is okay, thanks!  For the subject, IMHO it's better to use 
> something
> like: "rs6000: Remove useless copy_rtx in rs6000_emit_set_{,long}_const".
> I don't see NFC tag used much in GCC, though it's used a lot in llvm, but
> anyway you can append (NFC)/[NFC] at the end if you like.  :)
> 

"rs6000: Remove useless copy_rtx in rs6000_emit_set_{,long}_const" is great!

Thanks for your review and suggestions!


BR,
Jeff (Jiufu)

> BR,
> Kewen

[PATCH] libcpp: suppress builtin macro redefined warnings for LINE

2022-11-30 Thread Longjun Luo via Gcc-patches

As implied in
gcc.gnu.org/legacy-ml/gcc-patches/2008-09/msg00076.html,
gcc provides -Wno-builtin-macro-redefined to suppress warning when
redefining builtin macro. However, at that time, there was no
scenario for __LINE__ macro.

However, when we build a live patch, we compare sections compiled with
-ffunction-sections. Some otherwise identical functions are considered
changed because of the __LINE__ macro.

At present, to detect such a change caused by the __LINE__ macro, we
have to analyse the code and maintain a function list. For example,
in kpatch, see this commit:
github.com/dynup/kpatch/commit/0e1b95edeafa36edb7bcf11da6d1c00f76d7e03d.

So, in this scenario, when we compare sections, it would be better
to support suppressing builtin-macro-redefined warnings for the
__LINE__ macro.

Signed-off-by: Longjun Luo 
---
 gcc/testsuite/gcc.dg/builtin-redefine.c | 1 -
 libcpp/init.cc  | 2 +-
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/builtin-redefine.c b/gcc/testsuite/gcc.dg/builtin-redefine.c
index 882b2210992..9d5b42252ee 100644
--- a/gcc/testsuite/gcc.dg/builtin-redefine.c
+++ b/gcc/testsuite/gcc.dg/builtin-redefine.c
@@ -71,7 +71,6 @@
 /* { dg-bogus "Expected built-in is not defined" "" { target *-*-* } .-1 } */
 #endif
 
-#define __LINE__ 0   /* { dg-warning "-:\"__LINE__\" redef" } */
 #define __INCLUDE_LEVEL__ 0  /* { dg-warning "-:\"__INCLUDE_LEVEL__\" redef" } */
 #define __COUNTER__ 0/* { dg-warning "-:\"__COUNTER__\" redef" } */
 
diff --git a/libcpp/init.cc b/libcpp/init.cc
index 5f34e3515d2..2765b9838b7 100644
--- a/libcpp/init.cc
+++ b/libcpp/init.cc
@@ -421,7 +421,7 @@ static const struct builtin_macro builtin_array[] =
   B("__FILE__", BT_FILE,  false),
   B("__FILE_NAME__",BT_FILE_NAME, false),
   B("__BASE_FILE__",BT_BASE_FILE, false),
-  B("__LINE__", BT_SPECLINE,  true),
+  B("__LINE__", BT_SPECLINE,  false),
   B("__INCLUDE_LEVEL__", BT_INCLUDE_LEVEL, true),
   B("__COUNTER__",  BT_COUNTER,   true),
   /* Make sure to update the list of built-in
-- 
2.38.1
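
As a small illustration of the problem described in this message (a sketch written for this note, with hypothetical names, and not part of the patch): any code that expands `__LINE__` bakes its own line number into the object code, so two functions with identical text compile differently, and adding a line above a function changes its section contents.

```c
/* Hypothetical tracing macro: expands to the line it is used on.  */
#define WHERE() (__LINE__)

/* The two bodies are textually identical, yet each returns a different
   constant because they sit on different source lines.  */
static int first_site (void) { return WHERE (); }
static int second_site (void) { return WHERE (); }
```

Pinning the macro to a constant, as the removed test line `#define __LINE__ 0` does, would make both bodies compile to the same code, which is presumably the use the patch has in mind.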

[pushed] c++: small contracts fixes

2022-11-30 Thread Jason Merrill via Gcc-patches

Tested x86_64-pc-linux-gnu, applying to trunk.

-- 8< --

The first is an actual bug: remove_contract_attributes was only keeping one
attribute.  The second just helps flow analysis in optimizers and static
analyzers.

gcc/cp/ChangeLog:

* contracts.cc (remove_contract_attributes): Actually prepend
to the list.
* pt.cc (tsubst_contract): Only look for a postcondition if type is
nonnull.
---
 gcc/cp/contracts.cc | 2 +-
 gcc/cp/pt.cc| 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/cp/contracts.cc b/gcc/cp/contracts.cc
index a9097016768..45f52b20392 100644
--- a/gcc/cp/contracts.cc
+++ b/gcc/cp/contracts.cc
@@ -869,7 +869,7 @@ remove_contract_attributes (tree fndecl)
   tree list = NULL_TREE;
   for (tree p = DECL_ATTRIBUTES (fndecl); p; p = TREE_CHAIN (p))
 if (!cxx_contract_attribute_p (p))
-  list = tree_cons (TREE_PURPOSE (p), TREE_VALUE (p), NULL_TREE);
+  list = tree_cons (TREE_PURPOSE (p), TREE_VALUE (p), list);
   DECL_ATTRIBUTES (fndecl) = nreverse (list);
 }
 
diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 2d8e4fdd4b5..08de273a900 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -11561,7 +11561,7 @@ tsubst_contract (tree decl, tree t, tree args, 
tsubst_flags_t complain,
   tree r = copy_node (t);
 
   /* Rebuild the result variable.  */
-  if (POSTCONDITION_P (t) && POSTCONDITION_IDENTIFIER (t))
+  if (type && POSTCONDITION_P (t) && POSTCONDITION_IDENTIFIER (t))
 {
   tree oldvar = POSTCONDITION_IDENTIFIER (t);
 

base-commit: cda29c540037fbcf00a377196050953aab1d3d5b
-- 
2.31.1
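
The first fix above can be illustrated with a minimal sketch on a plain linked list. The types and functions here are simplified, hypothetical stand-ins for TREE_LIST, tree_cons and nreverse, not GCC code: consing each kept element onto NULL discards everything collected so far, while consing onto the running list keeps all of it.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified, hypothetical stand-in for a TREE_LIST node.  */
struct node
{
  int value;
  struct node *next;
};

/* Prepend VALUE to REST, in the spirit of tree_cons.  */
static struct node *
cons (int value, struct node *rest)
{
  struct node *n = malloc (sizeof *n);
  if (n == NULL)
    abort ();
  n->value = value;
  n->next = rest;
  return n;
}

/* Reverse LIST in place, in the spirit of nreverse.  */
static struct node *
reverse (struct node *list)
{
  struct node *prev = NULL;
  while (list != NULL)
    {
      struct node *next = list->next;
      list->next = prev;
      prev = list;
      list = next;
    }
  return prev;
}

/* Copy the non-negative entries of VALS; negative ones play the role of
   contract attributes.  With BUGGY set, each kept entry is consed onto
   NULL as in the old code, otherwise onto the list built so far.
   Nodes are never freed; this is a demonstration only.  */
static struct node *
keep_non_negative (const int *vals, size_t n, int buggy)
{
  assert (vals != NULL || n == 0);
  struct node *list = NULL;
  for (size_t i = 0; i < n; i++)
    if (vals[i] >= 0)
      list = cons (vals[i], buggy ? NULL : list);
  return reverse (list);
}

/* Count the nodes in LIST.  */
static size_t
length (const struct node *list)
{
  size_t len = 0;
  for (; list != NULL; list = list->next)
    len++;
  return len;
}
```

With the input { 1, -1, 2, 3 }, the buggy variant returns a one-element list holding only 3, while the fixed variant returns 1, 2, 3 in the original order.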

Re: [PATCH 3/3]rs6000: NFC no need copy_rtx in rs6000_emit_set_long_const and rs6000_emit_set_const

2022-11-30 Thread Kewen.Lin via Gcc-patches

Hi Jeff,

on 2022/12/1 09:36, Jiufu Guo wrote:
> Hi,
> 
> Function rs6000_emit_set_const/rs6000_emit_set_long_const are only invoked 
> from
> two "define_split"s where the target operand is limited to gpc_reg_operand or
> int_reg_operand, then the operand must be REG_P.
> And in rs6000_emit_set_const/rs6000_emit_set_long_const, to create temp rtx,
> it is using code like "gen_reg_rtx({S|D}Imode)", it must also be REG_P.
> So, copy_rtx is not needed for temp and dest.
> 
> This patch removes those "copy_rtx" for rs6000_emit_set_const and
> rs6000_emit_set_long_const.
> 
> Bootstrap & regtest pass on ppc64{,le}.
> Is this ok for trunk? 

This patch is okay, thanks!  For the subject, IMHO it's better to use something
like: "rs6000: Remove useless copy_rtx in rs6000_emit_set_{,long}_const".
I don't see NFC tag used much in GCC, though it's used a lot in llvm, but
anyway you can append (NFC)/[NFC] at the end if you like.  :)

BR,
Kewen

Re: [PATCH] RISC-V: optimize stack manipulation in save-restore

2022-11-30 Thread Fei Gao

On 2022-12-01 06:50  Palmer Dabbelt  wrote:
>
>On Wed, 30 Nov 2022 00:37:17 PST (-0800), gao...@eswincomputing.com wrote:
>> The stack that save-restore reserves is not well accumulated in stack 
>> allocation and deallocation.
>> This patch allows fewer instructions to be used in stack allocation and 
>> deallocation if save-restore is enabled,
>> and also gives a much clearer logic for save-restore stack manipulation.
>>
>> before patch:
>> bar:
>> call t0,__riscv_save_4
>> addi sp,sp,-64
>> ...
>> li   t0,-12288
>> addi t0,t0,-1968 # optimized out after patch
>> add  sp,sp,t0 # prologue
>> ...
>> li   t0,12288 # epilogue
>> addi t0,t0,2000 # optimized out after patch
>> add  sp,sp,t0
>> ...
>> addi sp,sp,32
>> tail __riscv_restore_4
>>
>> after patch:
>> bar:
>> call t0,__riscv_save_4
>> addi sp,sp,-2032
>> ...
>> li   t0,-12288
>> add  sp,sp,t0 # prologue
>> ...
>> li   t0,12288 # epilogue
>> add  sp,sp,t0
>> ...
>> addi sp,sp,2032
>> tail __riscv_restore_4
>>
>> gcc/ChangeLog:
>>
>> * config/riscv/riscv.cc (riscv_first_stack_step): add a new function 
>>parameter remaining_size.
>> (riscv_compute_frame_info): adapt new riscv_first_stack_step 
>>interface.
>> (riscv_expand_prologue): consider save-restore in stack allocation.
>> (riscv_expand_epilogue): consider save-restore in stack deallocation.
>>
>> gcc/testsuite/ChangeLog:
>>
>> * gcc.target/riscv/stack_save_restore.c: New test.
>> ---
>>  gcc/config/riscv/riscv.cc | 58 ++-
>>  .../gcc.target/riscv/stack_save_restore.c | 40 +
>>  2 files changed, 70 insertions(+), 28 deletions(-)
>>  create mode 100644 gcc/testsuite/gcc.target/riscv/stack_save_restore.c
>
>I guess with the RISC-V backend still being open for things as big as
>the V port we should probably be taking code like this as well?  I
>wouldn't be opposed to making an exception for the V code and holding
>everything else back, though.
>
>> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
>> index 05bdba5ab4d..9e92e729a5f 100644
>> --- a/gcc/config/riscv/riscv.cc
>> +++ b/gcc/config/riscv/riscv.cc
>> @@ -4634,7 +4634,7 @@ riscv_save_libcall_count (unsigned mask)
>> They decrease stack_pointer_rtx but leave frame_pointer_rtx and
>> hard_frame_pointer_rtx unchanged.  */
>>
>> -static HOST_WIDE_INT riscv_first_stack_step (struct riscv_frame_info 
>> *frame);
>> +static HOST_WIDE_INT riscv_first_stack_step (struct riscv_frame_info 
>> *frame, poly_int64 remaining_size);
>>
>>  /* Handle stack align for poly_int.  */
>>  static poly_int64
>> @@ -4663,7 +4663,7 @@ riscv_compute_frame_info (void)
>>   save/restore t0.  We check for this before clearing the frame struct.  
>>*/
>>    if (cfun->machine->interrupt_handler_p)
>>  {
>> -  HOST_WIDE_INT step1 = riscv_first_stack_step (frame);
>> +  HOST_WIDE_INT step1 = riscv_first_stack_step (frame, 
>> frame->total_size);
>>    if (! POLY_SMALL_OPERAND_P ((frame->total_size - step1)))
>>  interrupt_save_prologue_temp = true;
>>  }
>> @@ -4913,31 +4913,31 @@ riscv_restore_reg (rtx reg, rtx mem)
>> without adding extra instructions.  */
>>
>>  static HOST_WIDE_INT
>> -riscv_first_stack_step (struct riscv_frame_info *frame)
>> +riscv_first_stack_step (struct riscv_frame_info *frame, poly_int64 
>> remaining_size)
>>  {
>> -  HOST_WIDE_INT frame_total_constant_size;
>> -  if (!frame->total_size.is_constant ())
>> -    frame_total_constant_size
>> -  = riscv_stack_align (frame->total_size.coeffs[0])
>> -- riscv_stack_align (frame->total_size.coeffs[1]);
>> +  HOST_WIDE_INT remaining_const_size;
>> +  if (!remaining_size.is_constant ())
>> +    remaining_const_size
>> +  = riscv_stack_align (remaining_size.coeffs[0])
>> +- riscv_stack_align (remaining_size.coeffs[1]);
>
>The alignment looks off here, at least in the email.  Worth fixing it up
>if you're touching the lines anyway. 

Sure, I will change RISCV_STACK_ALIGN into riscv_stack_align.
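
As background for the function quoted here, the step-splitting idea can be modelled roughly as follows. This is a simplified sketch, not the code in riscv.cc: the constant names, the 16-byte alignment, and the final fallback are assumptions (the tail of the function is cut off in the quoted patch). It picks a first sp adjustment that fits in an ADDI immediate and, where possible, leaves a remainder that is a multiple of 4096.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical constants for this sketch; not GCC's own definitions.
constexpr int64_t kImmReach = 4096;   // span covered by a 12-bit signed immediate
constexpr int64_t kStackAlign = 16;   // assumed stack alignment in bytes
constexpr int64_t kMaxFirstStep = kImmReach / 2 - kStackAlign;  // 2032

// True if VALUE fits in the 12-bit signed immediate of ADDI.
constexpr bool fits_addi (int64_t value)
{
  return value >= -kImmReach / 2 && value < kImmReach / 2;
}

// Choose the first sp adjustment for REMAINING bytes of frame.
// MIN_FIRST_STEP is the smallest step that still covers the save area.
int64_t first_stack_step (int64_t remaining, int64_t min_first_step)
{
  assert (remaining >= 0 && min_first_step <= kMaxFirstStep);

  // Small frames need only one adjustment.
  if (fits_addi (remaining))
    return remaining;

  // Prefer the low bits of the size, so that the second adjustment is a
  // multiple of 4096 and needs no extra ADDI to build the constant.
  int64_t low_bits = remaining % kImmReach;
  int64_t min_second_step = remaining - kMaxFirstStep;
  if (!fits_addi (min_second_step)
      && low_bits < kImmReach / 2
      && low_bits >= min_first_step)
    return low_bits;

  // Assumed fallback: the largest aligned step that ADDI can encode.
  return kMaxFirstStep;
}
```

With the 14320 bytes adjusted after the save routine in the quoted assembly, the sketch yields a first step of 2032 and leaves 12288 for the second step, which matches the "after patch" sequence.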

>
>>    else
>> -    frame_total_constant_size = frame->total_size.to_constant ();
>> +    remaining_const_size = remaining_size.to_constant ();
>>
>> -  if (SMALL_OPERAND (frame_total_constant_size))
>> -    return frame_total_constant_size;
>> +  if (SMALL_OPERAND (remaining_const_size))
>> +    return remaining_const_size;
>>
>>    HOST_WIDE_INT min_first_step =
>> -    RISCV_STACK_ALIGN ((frame->total_size - 
>> frame->frame_pointer_offset).to_constant());
>> +    RISCV_STACK_ALIGN ((remaining_size - 
>> frame->frame_pointer_offset).to_constant());
>>    HOST_WIDE_INT max_first_step = IMM_REACH / 2 - PREFERRED_STACK_BOUNDARY / 
>>8;
>> -  HOST_WIDE_INT min_second_step = frame_total_constant_size - 
>> max_first_step;
>> +  HOST_WIDE_INT min_second_step = remaining_const_size - max_first_step;
>>    gcc_assert (min_first_step <= max_first_step);
>>
>>    /* As an optimization, use the least-significant bits of the total frame
>>   size, so that the second adjustmen

AArch64: Add UNSPECV_PATCHABLE_AREA [PR98776]

2022-11-30 Thread Pop, Sebastian via Gcc-patches

Hi,

Currently patchable area is at the wrong place on AArch64.  It is placed
immediately after function label, before .cfi_startproc.  This patch
adds UNSPECV_PATCHABLE_AREA for pseudo patchable area instruction and
modifies aarch64_print_patchable_function_entry to avoid placing
patchable area before .cfi_startproc.
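
For readers unfamiliar with the feature, a patchable area is requested either with -fpatchable-function-entry or with the function attribute shown in this minimal sketch. The function name and the NOP count are illustrative only and are not taken from the patch or its test case.

```cpp
#include <cassert>

// Request 2 NOPs of patchable area, none of them before the entry label.
// The function name and the counts are hypothetical examples.
__attribute__ ((patchable_function_entry (2, 0)))
int add_one (int x)
{
  return x + 1;
}

// The attribute only affects the emitted prologue padding, not behaviour.
void example_usage ()
{
  assert (add_one (41) == 42);
}
```

Where the NOPs land relative to the function label and .cfi_startproc is decided by the target hook that this patch modifies; the sketch only shows how user code opts in.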

The patch passed bootstrap and regression test on aarch64-linux.
Ok to commit to trunk and backport to active release branches?

Thanks,
Sebastian

gcc/
PR target/93492
* config/aarch64/aarch64-protos.h (aarch64_output_patchable_area):
Declared.
* config/aarch64/aarch64.cc (aarch64_print_patchable_function_entry):
Emit an UNSPECV_PATCHABLE_AREA pseudo instruction.
(aarch64_output_patchable_area): New.
* config/aarch64/aarch64.md (UNSPECV_PATCHABLE_AREA): New.
(patchable_area): Define.

gcc/testsuite/
PR target/93492
* gcc.target/aarch64/pr98776.c: New.


From b9cf87bcdf65f515b38f1851eb95c18aaa180253 Mon Sep 17 00:00:00 2001
From: Sebastian Pop 
Date: Wed, 30 Nov 2022 19:45:24 +
Subject: [PATCH] AArch64: Add UNSPECV_PATCHABLE_AREA [PR98776]

Currently patchable area is at the wrong place on AArch64.  It is placed
immediately after function label, before .cfi_startproc.  This patch
adds UNSPECV_PATCHABLE_AREA for pseudo patchable area instruction and
modifies aarch64_print_patchable_function_entry to avoid placing
patchable area before .cfi_startproc.

gcc/
	PR target/93492
	* config/aarch64/aarch64-protos.h (aarch64_output_patchable_area):
	Declared.
	* config/aarch64/aarch64.cc (aarch64_print_patchable_function_entry):
	Emit an UNSPECV_PATCHABLE_AREA pseudo instruction.
	(aarch64_output_patchable_area): New.
	* config/aarch64/aarch64.md (UNSPECV_PATCHABLE_AREA): New.
	(patchable_area): Define.

gcc/testsuite/
	PR target/93492
	* gcc.target/aarch64/pr98776.c: New.
---
 gcc/config/aarch64/aarch64-protos.h|  2 ++
 gcc/config/aarch64/aarch64.cc  | 24 +-
 gcc/config/aarch64/aarch64.md  | 14 +
 gcc/testsuite/gcc.target/aarch64/pr98776.c | 11 ++
 4 files changed, 50 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/pr98776.c

diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index 4be93c93c26..2fba24d947d 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -1074,4 +1074,6 @@ const char *aarch64_indirect_call_asm (rtx);
 extern bool aarch64_harden_sls_retbr_p (void);
 extern bool aarch64_harden_sls_blr_p (void);
 
+extern void aarch64_output_patchable_area (unsigned int, bool);
+
 #endif /* GCC_AARCH64_PROTOS_H */
diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index e97f3b32f7c..e84b33b958c 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -22684,7 +22684,29 @@ aarch64_print_patchable_function_entry (FILE *file,
   asm_fprintf (file, "\thint\t34 // bti c\n");
 }
 
-  default_print_patchable_function_entry (file, patch_area_size, record_p);
+  if (cfun->machine->label_is_assembled)
+{
+  rtx pa = gen_patchable_area (GEN_INT (patch_area_size),
+   GEN_INT (record_p));
+  basic_block bb = ENTRY_BLOCK_PTR_FOR_FN (cfun)->next_bb;
+  rtx_insn *insn = emit_insn_before (pa, BB_HEAD (bb));
+  INSN_ADDRESSES_NEW (insn, -1);
+}
+  else
+{
+  default_print_patchable_function_entry (file, patch_area_size,
+	  record_p);
+}
+}
+
+/* Output patchable area.  */
+
+void
+aarch64_output_patchable_area (unsigned int patch_area_size, bool record_p)
+{
+  default_print_patchable_function_entry (asm_out_file,
+	  patch_area_size,
+	  record_p);
 }
 
 /* Implement ASM_OUTPUT_DEF_FROM_DECLS.  Output .variant_pcs for aliases.  */
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 76b6898ca04..6501503eb25 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -303,6 +303,7 @@
 UNSPEC_TAG_SPACE		; Translate address to MTE tag address space.
 UNSPEC_LD1RO
 UNSPEC_SALT_ADDR
+UNSPECV_PATCHABLE_AREA
 ])
 
 (define_c_enum "unspecv" [
@@ -7821,6 +7822,19 @@
   [(set_attr "type" "ls64")]
 )
 
+(define_insn "patchable_area"
+  [(unspec_volatile [(match_operand 0 "const_int_operand")
+		 (match_operand 1 "const_int_operand")]
+		UNSPECV_PATCHABLE_AREA)]
+  ""
+{
+  aarch64_output_patchable_area (INTVAL (operands[0]),
+			 INTVAL (operands[1]) != 0);
+  return "";
+}
+  [(set (attr "length") (symbol_ref "INTVAL (operands[0])"))]
+)
+
 ;; AdvSIMD Stuff
 (include "aarch64-simd.md")
 
diff --git a/gcc/testsuite/gcc.target/aarch64/pr98776.c b/gcc/testsuite/gcc.target/aarch64/pr98776.c
new file mode 100644
index 000..b075b8f75ef
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr98776.c
@@ -0,0 +1,11 @@
+/* { dg-do "compile" } */
+/* { dg-options "-O1 -fpatc

[committed 7/7] analyzer: fix i18n issues in symbolic out-of-bounds [PR106626]

2022-11-30 Thread David Malcolm via Gcc-patches

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r13-4431-geaaf97b6147095.

gcc/analyzer/ChangeLog:
PR analyzer/106626
* bounds-checking.cc
(symbolic_past_the_end::describe_final_event): Delete, moving to
symbolic_buffer_overflow::describe_final_event and
symbolic_buffer_over_read::describe_final_event, eliminating
composition of text strings via "byte_str" and "m_dir_str".
(symbolic_past_the_end::m_dir_str): Delete field.
(symbolic_buffer_overflow::symbolic_buffer_overflow): Drop
m_dir_str.
(symbolic_buffer_overflow::describe_final_event): New, as noted
above.
(symbolic_buffer_over_read::symbolic_buffer_overflow): Drop
m_dir_str.
(symbolic_buffer_over_read::describe_final_event): New, as noted
above.
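
The i18n problem described in the ChangeLog entry above can be sketched as follows. Messages assembled from fragments such as "write"/"read" and "byte"/"bytes" cannot be translated reliably, so each case gets a complete sentence. The helper names and the "{n}"/"{off}" placeholders are hypothetical; the patch itself uses GCC's formatted_print with %E and %qE.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a translation lookup; returns the msgid as-is.
static std::string translate (const std::string &msgid)
{
  return msgid;
}

// Replace the first occurrence of KEY in FMT with VALUE.
static std::string substitute (std::string fmt, const std::string &key,
                               const std::string &value)
{
  std::string::size_type pos = fmt.find (key);
  if (pos != std::string::npos)
    fmt.replace (pos, key.size (), value);
  return fmt;
}

// Each branch uses a complete, separately translatable sentence rather
// than gluing "write" and "byte"/"bytes" into a shared template.
std::string describe_write (unsigned long num_bytes, unsigned long offset)
{
  const std::string msg
    = (num_bytes == 1
       ? translate ("write of {n} byte at offset {off} exceeds the buffer")
       : translate ("write of {n} bytes at offset {off} exceeds the buffer"));
  return substitute (substitute (msg, "{n}", std::to_string (num_bytes)),
                     "{off}", std::to_string (offset));
}

void example_usage ()
{
  assert (describe_write (4, 40)
          == "write of 4 bytes at offset 40 exceeds the buffer");
}
```

The cost is more branches and more strings, which is consistent with the patch's diffstat growing rather than shrinking.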

Signed-off-by: David Malcolm 
---
 gcc/analyzer/bounds-checking.cc | 192 +++-
 1 file changed, 138 insertions(+), 54 deletions(-)

diff --git a/gcc/analyzer/bounds-checking.cc b/gcc/analyzer/bounds-checking.cc
index aaf3f22109b..1c44790f86d 100644
--- a/gcc/analyzer/bounds-checking.cc
+++ b/gcc/analyzer/bounds-checking.cc
@@ -544,62 +544,10 @@ public:
 return label_text ();
   }
 
-  label_text
-  describe_final_event (const evdesc::final_event &ev) final override
-  {
-const char *byte_str;
-if (pending_diagnostic::same_tree_p (m_num_bytes, integer_one_node))
-  byte_str = "byte";
-else
-  byte_str = "bytes";
-
-if (m_offset)
-  {
-   if (m_num_bytes && TREE_CODE (m_num_bytes) == INTEGER_CST)
- {
-   if (m_diag_arg)
- return ev.formatted_print ("%s of %E %s at offset %qE"
-" exceeds %qE", m_dir_str,
-m_num_bytes, byte_str,
-m_offset, m_diag_arg);
-   else
- return ev.formatted_print ("%s of %E %s at offset %qE"
-" exceeds the buffer", m_dir_str,
-m_num_bytes, byte_str, m_offset);
- }
-   else if (m_num_bytes)
- {
-   if (m_diag_arg)
- return ev.formatted_print ("%s of %qE %s at offset %qE"
-" exceeds %qE", m_dir_str,
-m_num_bytes, byte_str,
-m_offset, m_diag_arg);
-   else
- return ev.formatted_print ("%s of %qE %s at offset %qE"
-" exceeds the buffer", m_dir_str,
-m_num_bytes, byte_str, m_offset);
- }
-   else
- {
-   if (m_diag_arg)
- return ev.formatted_print ("%s at offset %qE exceeds %qE",
-m_dir_str, m_offset, m_diag_arg);
-   else
- return ev.formatted_print ("%s at offset %qE exceeds the"
-" buffer", m_dir_str, m_offset);
- }
-  }
-if (m_diag_arg)
-  return ev.formatted_print ("out-of-bounds %s on %qE",
-m_dir_str, m_diag_arg);
-return ev.formatted_print ("out-of-bounds %s", m_dir_str);
-  }
-
 protected:
   tree m_offset;
   tree m_num_bytes;
   tree m_capacity;
-  const char *m_dir_str;
 };
 
 /* Concrete subclass to complain about overflows with symbolic values.  */
@@ -611,7 +559,6 @@ public:
tree num_bytes, tree capacity)
   : symbolic_past_the_end (reg, diag_arg, offset, num_bytes, capacity)
   {
-m_dir_str = "write";
   }
 
   const char *get_kind () const final override
@@ -638,6 +585,75 @@ public:
 "heap-based buffer overflow");
   }
   }
+
+  label_text
+  describe_final_event (const evdesc::final_event &ev) final override
+  {
+if (m_offset)
+  {
+   /* Known offset.  */
+   if (m_num_bytes)
+ {
+   /* Known offset, known size.  */
+   if (TREE_CODE (m_num_bytes) == INTEGER_CST)
+ {
+   /* Known offset, known constant size.  */
+   if (pending_diagnostic::same_tree_p (m_num_bytes,
+integer_one_node))
+ {
+   /* Singular m_num_bytes.  */
+   if (m_diag_arg)
+ return ev.formatted_print
+   ("write of %E byte at offset %qE exceeds %qE",
+m_num_bytes, m_offset, m_diag_arg);
+   else
+ return ev.formatted_print
+   ("write of %E byte at offset %qE exceeds the buffer",
+m_num_bytes, m_offset);
+ }
+   else
+ {
+   /* Plural m_num_bytes.  */
+

[committed 1/7] analyzer: move bounds checking to a new bounds-checking.cc

2022-11-30 Thread David Malcolm via Gcc-patches

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r13-4425-gb82b361af888a1.

gcc/ChangeLog:
* Makefile.in (ANALYZER_OBJS): Add analyzer/bounds-checking.o.

gcc/analyzer/ChangeLog:
* bounds-checking.cc: New file, taken from region-model.cc.
* region-model.cc (class out_of_bounds): Move to
bounds-checking.cc.
(class past_the_end): Likewise.
(class buffer_overflow): Likewise.
(class buffer_overread): Likewise.
(class buffer_underflow): Likewise.
(class buffer_underread): Likewise.
(class symbolic_past_the_end): Likewise.
(class symbolic_buffer_overflow): Likewise.
(class symbolic_buffer_overread): Likewise.
(region_model::check_symbolic_bounds): Likewise.
(maybe_get_integer_cst_tree): Likewise.
(region_model::check_region_bounds): Likewise.
* region-model.h: Add comment.

Signed-off-by: David Malcolm 
---
 gcc/Makefile.in |   1 +
 gcc/analyzer/bounds-checking.cc | 695 
 gcc/analyzer/region-model.cc| 653 --
 gcc/analyzer/region-model.h |   2 +
 4 files changed, 698 insertions(+), 653 deletions(-)
 create mode 100644 gcc/analyzer/bounds-checking.cc

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index fa5e5b444bb..615a07089ee 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1255,6 +1255,7 @@ ANALYZER_OBJS = \
analyzer/analyzer-pass.o \
analyzer/analyzer-selftests.o \
analyzer/bar-chart.o \
+   analyzer/bounds-checking.o \
analyzer/call-info.o \
analyzer/call-string.o \
analyzer/call-summary.o \
diff --git a/gcc/analyzer/bounds-checking.cc b/gcc/analyzer/bounds-checking.cc
new file mode 100644
index 000..19aaa51e6a8
--- /dev/null
+++ b/gcc/analyzer/bounds-checking.cc
@@ -0,0 +1,695 @@
+/* Bounds-checking of reads and writes to memory regions.
+   Copyright (C) 2019-2022 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it
+under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful, but
+WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#define INCLUDE_MEMORY
+#include "system.h"
+#include "coretypes.h"
+#include "make-unique.h"
+#include "tree.h"
+#include "function.h"
+#include "basic-block.h"
+#include "gimple.h"
+#include "gimple-iterator.h"
+#include "diagnostic-core.h"
+#include "diagnostic-metadata.h"
+#include "analyzer/analyzer.h"
+#include "analyzer/analyzer-logging.h"
+#include "analyzer/region-model.h"
+
+#if ENABLE_ANALYZER
+
+namespace ana {
+
+/* Abstract base class for all out-of-bounds warnings with concrete values.  */
+
+class out_of_bounds : public pending_diagnostic_subclass<out_of_bounds>
+{
+public:
+  out_of_bounds (const region *reg, tree diag_arg,
+byte_range out_of_bounds_range)
+  : m_reg (reg), m_diag_arg (diag_arg),
+m_out_of_bounds_range (out_of_bounds_range)
+  {}
+
+  const char *get_kind () const final override
+  {
+return "out_of_bounds_diagnostic";
+  }
+
+  bool operator== (const out_of_bounds &other) const
+  {
+return m_reg == other.m_reg
+  && m_out_of_bounds_range == other.m_out_of_bounds_range
+  && pending_diagnostic::same_tree_p (m_diag_arg, other.m_diag_arg);
+  }
+
+  int get_controlling_option () const final override
+  {
+return OPT_Wanalyzer_out_of_bounds;
+  }
+
+  void mark_interesting_stuff (interesting_t *interest) final override
+  {
+interest->add_region_creation (m_reg);
+  }
+
+protected:
+  const region *m_reg;
+  tree m_diag_arg;
+  byte_range m_out_of_bounds_range;
+};
+
+/* Abstract subclass to complain about out-of-bounds
+   past the end of the buffer.  */
+
+class past_the_end : public out_of_bounds
+{
+public:
+  past_the_end (const region *reg, tree diag_arg, byte_range range,
+   tree byte_bound)
+  : out_of_bounds (reg, diag_arg, range), m_byte_bound (byte_bound)
+  {}
+
+  bool operator== (const past_the_end &other) const
+  {
+return out_of_bounds::operator== (other)
+  && pending_diagnostic::same_tree_p (m_byte_bound,
+  other.m_byte_bound);
+  }
+
+  label_text
+  describe_region_creation_event (const evdesc::region_creation &ev) final
+  override
+  {
+if (m_byte_bound && TREE_CODE (m_byte_bound) == INTEGER_CST)
+  return ev.formatted_print ("capacity is %E bytes", m_byte_bound);
+
+return label_text ();

[committed 5/7] diagnostics: tweak diagnostic_path::interprocedural_p [PR106626]

2022-11-30 Thread David Malcolm via Gcc-patches

The region-creation event at the start of...

: In function 'int_arr_write_element_after_end_off_by_one':
:14:11: warning: buffer overflow [CWE-787] [-Wanalyzer-out-of-bounds]
   14 |   arr[10] = x;
  |   ^~~
  event 1
|
|   10 | int32_t arr[10];
|  | ^~~
|  | |
|  | (1) capacity is 40 bytes
|
+--> 'int_arr_write_element_after_end_off_by_one': events 2-3
   |
   |   12 | void int_arr_write_element_after_end_off_by_one(int32_t x)
   |  |  ^~
   |  |  |
   |  |  (2) entry to 'int_arr_write_element_after_end_off_by_one'
   |   13 | {
   |   14 |   arr[10] = x;  /* { dg-line line } */
   |  |   ~~~
   |  |   |
   |  |   (3) out-of-bounds write from byte 40 till byte 43 but 'arr' ends at byte 40
   |
:14:11: note: write of 4 bytes to beyond the end of 'arr'
   14 |   arr[10] = x;
  |   ^~~
:14:11: note: valid subscripts for 'arr' are '[0]' to '[9]'

...makes diagnostic_manager::finish_pruning consider the path to be
interprocedural, and so it doesn't prune the function entry event.

This patch tweaks diagnostic_path::interprocedural_p to ignore
leading events outside of any function, so that it considers the
path to be intraprocedural, and thus diagnostic_manager::finish_pruning
prunes the function entry event, leading to this simpler output:

: In function 'int_arr_write_element_after_end_off_by_one':
:14:11: warning: buffer overflow [CWE-787] [-Wanalyzer-out-of-bounds]
   14 |   arr[10] = x;
  |   ^~~
  event 1
|
|   10 | int32_t arr[10];
|  | ^~~
|  | |
|  | (1) capacity is 40 bytes
|
+--> 'int_arr_write_element_after_end_off_by_one': event 2
   |
   |   14 |   arr[10] = x;
   |  |   ^~~
   |  |   |
   |  |   (2) out-of-bounds write from byte 40 till byte 43 but 'arr' ends at byte 40
   |
:14:11: note: write of 4 bytes to beyond the end of 'arr'
:14:11: note: valid subscripts for 'arr' are '[0]' to '[9]'
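
The rule described above can be modelled with a small self-contained sketch. The event struct is a hypothetical stand-in for diagnostic_event, and the loop after the first in-function event is an assumption, since the patch text is cut off before that part of interprocedural_p.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for diagnostic_event.
struct event
{
  std::string fndecl;  // empty means "outside of any function"
  int stack_depth;     // 0 when outside of any function
};

// Find the first event that is inside a function.  On success write its
// index to *OUT_IDX and return true; otherwise return false.
bool get_first_event_in_a_function (const std::vector<event> &path,
                                    unsigned *out_idx)
{
  for (unsigned i = 0; i < path.size (); i++)
    if (!(path[i].fndecl.empty () && path[i].stack_depth == 0))
      {
        *out_idx = i;
        return true;
      }
  return false;
}

// Return true if the path involves more than one function, ignoring
// leading events that are outside of any function.
bool interprocedural_p (const std::vector<event> &path)
{
  unsigned first;
  if (!get_first_event_in_a_function (path, &first))
    return false;

  // Assumed continuation: any later change of function or stack depth
  // makes the path interprocedural.
  for (unsigned i = first + 1; i < path.size (); i++)
    if (path[i].fndecl != path[first].fndecl
        || path[i].stack_depth != path[first].stack_depth)
      return true;
  return false;
}

void example_usage ()
{
  // A global region-creation event followed by events in one function.
  std::vector<event> path = {{"", 0}, {"f", 1}, {"f", 1}};
  assert (!interprocedural_p (path));
}
```

With the leading global event skipped, the example path counts as intraprocedural, which is what lets the function-entry event be pruned.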

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r13-4429-g1d86af242bc4a8.

gcc/ChangeLog:
PR analyzer/106626
* diagnostic-path.h
(diagnostic_path::get_first_event_in_a_function): New decl.
* diagnostic.cc (diagnostic_path::get_first_event_in_a_function):
New.
(diagnostic_path::interprocedural_p): Ignore leading events that
are outside of any function.

gcc/testsuite/ChangeLog:
PR analyzer/106626
* gcc.dg/analyzer/out-of-bounds-multiline-1.c: New test.

Signed-off-by: David Malcolm 
---
 gcc/diagnostic-path.h |  3 ++
 gcc/diagnostic.cc | 37 +--
 .../analyzer/out-of-bounds-multiline-1.c  | 37 +++
 3 files changed, 74 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/out-of-bounds-multiline-1.c

diff --git a/gcc/diagnostic-path.h b/gcc/diagnostic-path.h
index 8ce4ff763d4..aa5cda8c23a 100644
--- a/gcc/diagnostic-path.h
+++ b/gcc/diagnostic-path.h
@@ -167,6 +167,9 @@ class diagnostic_path
   virtual const diagnostic_event & get_event (int idx) const = 0;
 
   bool interprocedural_p () const;
+
+private:
+  bool get_first_event_in_a_function (unsigned *out_idx) const;
 };
 
 /* Concrete subclasses.  */
diff --git a/gcc/diagnostic.cc b/gcc/diagnostic.cc
index a9562a815b1..322515b3242 100644
--- a/gcc/diagnostic.cc
+++ b/gcc/diagnostic.cc
@@ -939,18 +939,49 @@ diagnostic_event::meaning::maybe_get_property_str (enum property p)
 
 /* class diagnostic_path.  */
 
+/* Subroutine of diagnostic_path::interprocedural_p.
+   Look for the first event in this path that is within a function
+   i.e. has a non-NULL fndecl, and a non-zero stack depth.
+   If found, write its index to *OUT_IDX and return true.
+   Otherwise return false.  */
+
+bool
+diagnostic_path::get_first_event_in_a_function (unsigned *out_idx) const
+{
+  const unsigned num = num_events ();
+  for (unsigned i = 0; i < num; i++)
+{
+  if (!(get_event (i).get_fndecl () == NULL
+   && get_event (i).get_stack_depth () == 0))
+   {
+ *out_idx = i;
+ return true;
+   }
+}
+  return false;
+}
+
 /* Return true if the events in this path involve more than one
function, or false if it is purely intraprocedural.  */
 
 bool
 diagnostic_path::interprocedural_p () const
 {
+  /* Ignore leading events that are outside of any function.  */
+  unsigned first_fn_event_idx;
+  if (!get_first_event_in_a_function (&first_fn_event_idx))
+return false;
+
+  const diagnostic_event &first_fn_event = get_event (first_fn_event_idx);
+  tree first_fndecl = first_fn_event.get_fnd

[committed 4/7] analyzer: more bounds-checking wording tweaks [PR106626]

2022-11-30 Thread David Malcolm via Gcc-patches

This patch tweaks the wording of -Wanalyzer-out-of-bounds:

* use the spellings/terminology of CWE:
  * replace "underread" with "under-read", as per:
 https://cwe.mitre.org/data/definitions/127.html
  * replace "overread" with "over-read" as per:
 https://cwe.mitre.org/data/definitions/126.html
  * replace "underflow" with "underwrite" as per:
https://cwe.mitre.org/data/definitions/124.html

* wherever known, specify the memory region of the bad access,
so that it says e.g. "heap-based buffer over-read"
or "stack-based buffer over-read"
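
The second point can be illustrated with a short sketch that selects the wording from the memory region of the access. The enum and function below are hypothetical simplifications for illustration; they are not the analyzer's own declarations.

```cpp
#include <cassert>
#include <string>

// Hypothetical, reduced set of memory spaces for this sketch.
enum class memory_space { unknown, stack, heap };

// Pick the warning text for an over-read, naming the region where known.
std::string over_read_description (memory_space space)
{
  switch (space)
    {
    case memory_space::stack:
      return "stack-based buffer over-read";
    case memory_space::heap:
      return "heap-based buffer over-read";
    case memory_space::unknown:
      break;
    }
  // Fall back to the generic wording when the region is not known.
  return "buffer over-read";
}

void example_usage ()
{
  assert (over_read_description (memory_space::heap)
          == "heap-based buffer over-read");
}
```

Each wording is a complete string, so the choice of region does not require gluing text fragments together.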

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r13-4428-gdf460cf51b2586.

gcc/analyzer/ChangeLog:
PR analyzer/106626
* bounds-checking.cc (out_of_bounds::get_memory_space): New.
(buffer_overflow::emit): Use it.
(class buffer_overread): Rename to...
(class buffer_over_read): ...this.
(buffer_over_read::emit): Specify which memory space the read is
from, where known.  Change "overread" to "over-read".
(class buffer_underflow): Rename to...
(class buffer_underwrite): ...this.
(buffer_underwrite::emit): Specify which memory space the write is
to, where known.  Change "underflow" to "underwrite".
(class buffer_underread): Rename to...
(class buffer_under_read): ...this.
(buffer_under_read::emit): Specify which memory space the read is
from, where known.  Change "underread" to "under-read".
(symbolic_past_the_end::get_memory_space): New.
(symbolic_buffer_overflow::emit): Use it.
(class symbolic_buffer_overread): Rename to...
(class symbolic_buffer_over_read): ...this.
(symbolic_buffer_over_read::emit): Specify which memory space the
read is from, where known.  Change "overread" to "over-read".
(region_model::check_symbolic_bounds): Update for class renaming.
(region_model::check_region_bounds): Likewise.

gcc/testsuite/ChangeLog:
PR analyzer/106626
* gcc.dg/analyzer/call-summaries-2.c: Update expected results.
* gcc.dg/analyzer/out-of-bounds-1.c: Likewise.
* gcc.dg/analyzer/out-of-bounds-2.c: Likewise.
* gcc.dg/analyzer/out-of-bounds-3.c: Likewise.
* gcc.dg/analyzer/out-of-bounds-4.c: Likewise.
* gcc.dg/analyzer/out-of-bounds-5.c: Likewise.
* gcc.dg/analyzer/out-of-bounds-container_of.c: Likewise.
* gcc.dg/analyzer/out-of-bounds-read-char-arr.c: Likewise.  Rename
functions from "int_arr_" to "char_arr_".
* gcc.dg/analyzer/out-of-bounds-read-int-arr.c: Update expected
results.
* gcc.dg/analyzer/out-of-bounds-read-struct-arr.c: New test.
* gcc.dg/analyzer/out-of-bounds-write-char-arr.c: Update expected
results.  Rename functions from "int_arr_" to "char_arr_".
* gcc.dg/analyzer/out-of-bounds-write-int-arr.c: Update expected
results.
* gcc.dg/analyzer/out-of-bounds-write-struct-arr.c: New test.
* gcc.dg/analyzer/pr101962.c: Update expected results.
* gcc.dg/analyzer/realloc-5.c: Update expected results.
* gcc.dg/analyzer/zlib-3.c: Update expected results.

Signed-off-by: David Malcolm 
---
 gcc/analyzer/bounds-checking.cc   | 133 +-
 .../gcc.dg/analyzer/call-summaries-2.c|   2 +-
 .../gcc.dg/analyzer/out-of-bounds-1.c |   4 +-
 .../gcc.dg/analyzer/out-of-bounds-2.c |  15 +-
 .../gcc.dg/analyzer/out-of-bounds-3.c |  27 ++--
 .../gcc.dg/analyzer/out-of-bounds-4.c |  15 +-
 .../gcc.dg/analyzer/out-of-bounds-5.c |  20 +--
 .../analyzer/out-of-bounds-container_of.c |   4 +-
 .../analyzer/out-of-bounds-read-char-arr.c|  34 +++--
 .../analyzer/out-of-bounds-read-int-arr.c |  18 ++-
 .../analyzer/out-of-bounds-read-struct-arr.c  |  65 +
 .../analyzer/out-of-bounds-write-char-arr.c   |  22 +--
 .../analyzer/out-of-bounds-write-int-arr.c|   6 +-
 .../analyzer/out-of-bounds-write-struct-arr.c |  65 +
 gcc/testsuite/gcc.dg/analyzer/pr101962.c  |   2 +-
 gcc/testsuite/gcc.dg/analyzer/realloc-5.c |   2 +-
 gcc/testsuite/gcc.dg/analyzer/zlib-3.c|   2 +-
 17 files changed, 327 insertions(+), 109 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/out-of-bounds-read-struct-arr.c
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/out-of-bounds-write-struct-arr.c

diff --git a/gcc/analyzer/bounds-checking.cc b/gcc/analyzer/bounds-checking.cc
index b02bc79a926..bc7d2dd17ae 100644
--- a/gcc/analyzer/bounds-checking.cc
+++ b/gcc/analyzer/bounds-checking.cc
@@ -71,6 +71,11 @@ public:
   }
 
 protected:
+  enum memory_space get_memory_space () const
+  {
+return m_reg->get_memory_space ();
+  }
+
   /* Potentially add a note about valid ways to index this array, such
  as (given "int arr[10];"):
note: valid subscripts for 'arr' are '[0]' to '[9]'
@@ -150,7 +15

[committed 6/7] analyzer: unify bounds-checking class hierarchies

2022-11-30 Thread David Malcolm via Gcc-patches

Convert out-of-bounds class hierarchy from:

  pending_diagnostic
    out_of_bounds
      past_the_end
        buffer_overflow (*)
        buffer_over_read (*)
      buffer_underwrite (*)
      buffer_under_read (*)
    symbolic_past_the_end
      symbolic_buffer_overflow (*)
      symbolic_buffer_over_read (*)

to:

  pending_diagnostic
    out_of_bounds
      concrete_out_of_bounds
        concrete_past_the_end
          concrete_buffer_overflow (*)
          concrete_buffer_over_read (*)
        concrete_buffer_underwrite (*)
        concrete_buffer_under_read (*)
      symbolic_past_the_end
        symbolic_buffer_overflow (*)
        symbolic_buffer_over_read (*)

where the concrete classes (i.e. the instantiable ones) are marked
with a (*).
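
As a rough illustration of the resulting hierarchy, the skeleton below mirrors the class names listed above with all members omitted. It is a sketch only: the get_kind strings and the kind_of helper are assumptions and are not taken from the patch.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

struct pending_diagnostic
{
  virtual ~pending_diagnostic () = default;
  virtual const char *get_kind () const = 0;
};

// Abstract layers: none of these implement get_kind.
struct out_of_bounds : pending_diagnostic {};
struct concrete_out_of_bounds : out_of_bounds {};
struct concrete_past_the_end : concrete_out_of_bounds {};
struct symbolic_past_the_end : out_of_bounds {};

// Instantiable leaves, i.e. the classes marked with (*).
// The returned strings are placeholders chosen for this sketch.
#define LEAF(NAME, BASE)                                          \
  struct NAME final : BASE                                        \
  {                                                               \
    const char *get_kind () const override { return #NAME; }      \
  }

LEAF (concrete_buffer_overflow, concrete_past_the_end);
LEAF (concrete_buffer_over_read, concrete_past_the_end);
LEAF (concrete_buffer_underwrite, concrete_out_of_bounds);
LEAF (concrete_buffer_under_read, concrete_out_of_bounds);
LEAF (symbolic_buffer_overflow, symbolic_past_the_end);
LEAF (symbolic_buffer_over_read, symbolic_past_the_end);

#undef LEAF

static_assert (std::is_abstract<concrete_past_the_end>::value,
               "intermediate classes stay abstract");
static_assert (!std::is_abstract<symbolic_buffer_overflow>::value,
               "leaf classes are instantiable");

// Every leaf, concrete or symbolic, can now be handled as out_of_bounds.
std::string kind_of (const out_of_bounds &diag)
{
  const char *kind = diag.get_kind ();
  assert (kind != nullptr);
  return kind;
}
```

The point of the unified tree is visible in kind_of: symbolic and concrete diagnostics share the out_of_bounds base, which was not the case in the first hierarchy.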

Doing so uncovered a bug where, for CWE-131-examples.c, we were
emitting an extra:
  warning: heap-based buffer over-read [CWE-122] [-Wanalyzer-out-of-bounds]
at the:
  WidgetList[numWidgets] = NULL;
The issue was that within set_next_state we get the rvalue for the LHS,
which looks like a read to the bounds-checker.  The patch fixes this by
passing NULL as the region_model_context * for such accesses.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r13-4430-g8bc9e4ee874ea3.

gcc/analyzer/ChangeLog:
* bounds-checking.cc (class out_of_bounds): Split out from...
(class concrete_out_of_bounds): New abstract subclass.
(class past_the_end): Rename to...
(class concrete_past_the_end): ...this, and make a subclass of
concrete_out_of_bounds.
(class buffer_overflow): Rename to...
(class concrete_buffer_overflow): ...this, and make a subclass of
concrete_past_the_end.
(class buffer_over_read): Rename to...
(class concrete_buffer_over_read): ...this, and make a subclass of
concrete_past_the_end.
(class buffer_underwrite): Rename to...
(class concrete_buffer_underwrite): ...this, and make a subclass
of concrete_out_of_bounds.
(class buffer_under_read): Rename to...
(class concrete_buffer_under_read): ...this, and make a subclass
of concrete_out_of_bounds.
(class symbolic_past_the_end): Convert to a subclass of
out_of_bounds.
(symbolic_buffer_overflow::get_kind): New.
(symbolic_buffer_over_read::get_kind): New.
(region_model::check_region_bounds): Update for renamings.
* engine.cc (impl_sm_context::set_next_state): Eliminate
"new_ctxt", passing NULL to get_rvalue instead.
(impl_sm_context::warn): Likewise.

Signed-off-by: David Malcolm 
---
 gcc/analyzer/bounds-checking.cc | 185 +++-
 gcc/analyzer/engine.cc  |  24 +
 2 files changed, 115 insertions(+), 94 deletions(-)

diff --git a/gcc/analyzer/bounds-checking.cc b/gcc/analyzer/bounds-checking.cc
index bc7d2dd17ae..aaf3f22109b 100644
--- a/gcc/analyzer/bounds-checking.cc
+++ b/gcc/analyzer/bounds-checking.cc
@@ -37,27 +37,21 @@ along with GCC; see the file COPYING3.  If not see
 
 namespace ana {
 
-/* Abstract base class for all out-of-bounds warnings with concrete values.  */
+/* Abstract base class for all out-of-bounds warnings.  */
 
-class out_of_bounds : public pending_diagnostic_subclass<out_of_bounds>
+class out_of_bounds : public pending_diagnostic
 {
 public:
-  out_of_bounds (const region *reg, tree diag_arg,
-byte_range out_of_bounds_range)
-  : m_reg (reg), m_diag_arg (diag_arg),
-m_out_of_bounds_range (out_of_bounds_range)
+  out_of_bounds (const region *reg, tree diag_arg)
+  : m_reg (reg), m_diag_arg (diag_arg)
   {}
 
-  const char *get_kind () const final override
-  {
-return "out_of_bounds_diagnostic";
-  }
-
-  bool operator== (const out_of_bounds &other) const
+  bool subclass_equal_p (const pending_diagnostic &base_other) const override
   {
-return m_reg == other.m_reg
-  && m_out_of_bounds_range == other.m_out_of_bounds_range
-  && pending_diagnostic::same_tree_p (m_diag_arg, other.m_diag_arg);
+const out_of_bounds &other
+  (static_cast<const out_of_bounds &> (base_other));
+return (m_reg == other.m_reg
+   && pending_diagnostic::same_tree_p (m_diag_arg, other.m_diag_arg));
   }
 
   int get_controlling_option () const final override
@@ -106,25 +100,51 @@ protected:
 
   const region *m_reg;
   tree m_diag_arg;
+};
+
+/* Abstract base class for all out-of-bounds warnings where the
+   out-of-bounds range is concrete.  */
+
+class concrete_out_of_bounds : public out_of_bounds
+{
+public:
+  concrete_out_of_bounds (const region *reg, tree diag_arg,
+ byte_range out_of_bounds_range)
+  : out_of_bounds (reg, diag_arg),
+m_out_of_bounds_range (out_of_bounds_range)
+  {}
+
+  bool subclass_equal_p (const pending_diagnostic &base_other) const override
+  {
+const concrete_out_of_bounds &other
+  (static_cast<const concrete_out_of_bounds &> (base_other));
+return (out_of_bounds::subclass_equal_p (othe

[committed 3/7] analyzer: add note about valid subscripts [PR106626]

2022-11-30 Thread David Malcolm via Gcc-patches

Consider -fanalyzer on:

#include <stdint.h>

int32_t arr[10];

void int_arr_write_element_after_end_off_by_one(int32_t x)
{
  arr[10] = x;
}

Trunk x86_64: https://godbolt.org/z/17zn3qYY4

Currently we emit:

: In function 'int_arr_write_element_after_end_off_by_one':
:7:11: warning: buffer overflow [CWE-787] [-Wanalyzer-out-of-bounds]
7 |   arr[10] = x;
  |   ^~~
  event 1
|
|3 | int32_t arr[10];
|  | ^~~
|  | |
|  | (1) capacity is 40 bytes
|
+--> 'int_arr_write_element_after_end_off_by_one': events 2-3
   |
   |5 | void int_arr_write_element_after_end_off_by_one(int32_t x)
   |  |  ^~
   |  |  |
   |  |  (2) entry to 
'int_arr_write_element_after_end_off_by_one'
   |6 | {
   |7 |   arr[10] = x;
   |  |   ~~~
   |  |   |
   |  |   (3) out-of-bounds write from byte 40 till byte 43 
but 'arr' ends at byte 40
   |
<source>:7:11: note: write of 4 bytes to beyond the end of 'arr'
7 |   arr[10] = x;
  |   ^~~

This is worded in terms of bytes, due to the way -Wanalyzer-out-of-bounds
is implemented, but this isn't what the user wrote.

This patch tries to get closer to the user's code by adding a note about
array bounds when we're referring to an array.  In the above example it
adds this trailing note:

  note: valid subscripts for 'arr' are '[0]' to '[9]'
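
The relationship between the byte-based wording and the subscript-based
note can be sketched in plain C.  This is only an illustration, not the
analyzer's implementation: "struct access_info" and "describe_subscript"
are hypothetical names, and the element size and count are passed in by
hand rather than read from a tree type.

```c
#include <stddef.h>

/* Hypothetical summary of one array access, for illustration only.  */
struct access_info
{
  size_t capacity;      /* total size of the array in bytes */
  size_t first_byte;    /* offset of the first byte accessed */
  size_t last_byte;     /* offset of the last byte accessed */
  size_t min_subscript; /* lowest valid subscript */
  size_t max_subscript; /* highest valid subscript */
  int out_of_bounds;    /* nonzero if any accessed byte is past the end */
};

/* Describe an access to element IDX of an array of N_ELEMS elements,
   each ELEM_SIZE bytes wide.  Assumes N_ELEMS > 0 and ELEM_SIZE > 0.  */
static struct access_info
describe_subscript (size_t elem_size, size_t n_elems, size_t idx)
{
  struct access_info info;
  info.capacity = elem_size * n_elems;
  info.first_byte = idx * elem_size;
  info.last_byte = info.first_byte + elem_size - 1;
  info.min_subscript = 0;
  info.max_subscript = n_elems - 1;
  info.out_of_bounds = info.last_byte >= info.capacity;
  return info;
}
```

For "int32_t arr[10]" and the write to "arr[10]", calling
describe_subscript (4, 10, 10) gives a capacity of 40 bytes, an access
covering bytes 40 through 43, and valid subscripts 0 to 9, which lines up
with both the event text and the new note.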

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r13-4427-g7c655699ed51b0.

gcc/analyzer/ChangeLog:
PR analyzer/106626
* bounds-checking.cc (out_of_bounds::maybe_describe_array_bounds):
New.
(buffer_overflow::emit): Call maybe_describe_array_bounds.
(buffer_overread::emit): Likewise.
(buffer_underflow::emit): Likewise.
(buffer_underread::emit): Likewise.

gcc/testsuite/ChangeLog:
PR analyzer/106626
* gcc.dg/analyzer/call-summaries-2.c: Add dg-message for expected
note about valid indexes.
* gcc.dg/analyzer/out-of-bounds-1.c: Likewise, fixing up existing
dg-message directives.
* gcc.dg/analyzer/out-of-bounds-write-char-arr.c: Likewise.
* gcc.dg/analyzer/out-of-bounds-write-int-arr.c: Likewise.

Signed-off-by: David Malcolm 
---
 gcc/analyzer/bounds-checking.cc   | 46 +--
 .../gcc.dg/analyzer/call-summaries-2.c|  1 +
 .../gcc.dg/analyzer/out-of-bounds-1.c | 16 ---
 .../analyzer/out-of-bounds-write-char-arr.c   |  6 +++
 .../analyzer/out-of-bounds-write-int-arr.c|  6 +++
 5 files changed, 64 insertions(+), 11 deletions(-)

diff --git a/gcc/analyzer/bounds-checking.cc b/gcc/analyzer/bounds-checking.cc
index ad7f431ea2f..b02bc79a926 100644
--- a/gcc/analyzer/bounds-checking.cc
+++ b/gcc/analyzer/bounds-checking.cc
@@ -71,6 +71,34 @@ public:
   }
 
 protected:
+  /* Potentially add a note about valid ways to index this array, such
+ as (given "int arr[10];"):
+   note: valid subscripts for 'arr' are '[0]' to '[9]'
+ We print the '[' and ']' characters so as to express the valid
+ subscripts using C syntax, rather than just as byte ranges,
+ which hopefully is more clear to the user.  */
+  void
+  maybe_describe_array_bounds (location_t loc) const
+  {
+if (!m_diag_arg)
+  return;
+tree t = TREE_TYPE (m_diag_arg);
+if (!t)
+  return;
+if (TREE_CODE (t) != ARRAY_TYPE)
+  return;
+tree domain = TYPE_DOMAIN (t);
+if (!domain)
+  return;
+tree max_idx = TYPE_MAX_VALUE (domain);
+if (!max_idx)
+  return;
+tree min_idx = TYPE_MIN_VALUE (domain);
+inform (loc,
+   "valid subscripts for %qE are %<[%E]%> to %<[%E]%>",
+   m_diag_arg, min_idx, max_idx);
+  }
+
   const region *m_reg;
   tree m_diag_arg;
   byte_range m_out_of_bounds_range;
@@ -165,6 +193,8 @@ public:
  inform (rich_loc->get_loc (),
  "write to beyond the end of %qE",
  m_diag_arg);
+
+   maybe_describe_array_bounds (rich_loc->get_loc ());
   }
 
 return warned;
@@ -245,6 +275,8 @@ public:
  inform (rich_loc->get_loc (),
  "read from after the end of %qE",
  m_diag_arg);
+
+   maybe_describe_array_bounds (rich_loc->get_loc ());
   }
 
 return warned;
@@ -297,8 +329,11 @@ public:
   {
 diagnostic_metadata m;
 m.add_cwe (124);
-return warning_meta (rich_loc, m, get_controlling_option (),
-"buffer underflow");
+bool warned = warning_meta (rich_loc, m, get_controlling_option (),
+   "buffer underflow");
+if (warned)
+  maybe_describe_array_bounds (rich_loc->get_loc ());
+return warned;
   }
 
   label_text describe_final_event (const evdesc::final_event &ev)
@@ -346,

[committed 2/7] analyzer: fix wording of 'number of bad bytes' note [PR106626]

2022-11-30 Thread David Malcolm via Gcc-patches

Consider -fanalyzer on:

#include <stdint.h>

int32_t arr[10];

void int_arr_write_element_after_end_far(int32_t x)
{
  arr[100] = x;
}

Trunk x86_64: https://godbolt.org/z/7GqEcYGq6

Currently we emit:

<source>: In function 'int_arr_write_element_after_end_far':
<source>:7:12: warning: buffer overflow [CWE-787] [-Wanalyzer-out-of-bounds]
7 |   arr[100] = x;
  |   ~^~~
  event 1
|
|3 | int32_t arr[10];
|  | ^~~
|  | |
|  | (1) capacity is 40 bytes
|
+--> 'int_arr_write_element_after_end_far': events 2-3
   |
   |5 | void int_arr_write_element_after_end_far(int32_t x)
   |  |  ^~~
   |  |  |
   |  |  (2) entry to 'int_arr_write_element_after_end_far'
   |6 | {
   |7 |   arr[100] = x;
   |  |   
   |  ||
   |  |(3) out-of-bounds write from byte 400 till byte 
403 but 'arr' ends at byte 40
   |
<source>:7:12: note: write is 4 bytes past the end of 'arr'
7 |   arr[100] = x;
  |   ~^~~

The wording of the final note:
  "write is 4 bytes past the end of 'arr'"
reads to me as if the "4 bytes past" is describing where the access
occurs, which seems wrong, as the write is far beyond the end of the
array.  Looking at the implementation, it's actually describing the
number of bytes within the access that are beyond the bounds of the
buffer.

This patch updates the wording so that the final note reads
  "write of 4 bytes to beyond the end of 'arr'"
which more clearly expresses that it's the size of the access
being described.

The patch also uses inform_n to avoid emitting "1 bytes".
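
The distinction between "how many bytes are bad" and "how far past the
end the access starts" can be sketched as below.  This is a standalone
illustration under simplifying assumptions (plain size_t offsets, no
wide ints); "count_bad_bytes" and "format_write_note" are hypothetical
helpers, and the singular/plural choice only mimics the idea of passing
two format strings to inform_n.

```c
#include <stddef.h>
#include <stdio.h>

/* Number of bytes of the access [FIRST, FIRST + SIZE) that lie at or
   beyond CAPACITY.  Assumes FIRST + SIZE does not overflow.  */
static size_t
count_bad_bytes (size_t capacity, size_t first, size_t size)
{
  size_t end = first + size;
  if (end <= capacity)
    return 0;
  size_t bad_start = first > capacity ? first : capacity;
  return end - bad_start;
}

/* Format the note into BUF, picking the singular or plural wording so
   that "1 bytes" is never produced.  Returns the snprintf result.  */
static int
format_write_note (char *buf, size_t buflen, size_t n, const char *name)
{
  if (n == 1)
    return snprintf (buf, buflen,
                     "write of %zu byte to beyond the end of '%s'",
                     n, name);
  return snprintf (buf, buflen,
                   "write of %zu bytes to beyond the end of '%s'",
                   n, name);
}
```

With a 40-byte buffer, a 4-byte write starting at byte 400 has 4 bad
bytes even though it starts 360 bytes past the end, while a 4-byte write
starting at byte 38 has only 2 bad bytes.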

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r13-4426-gd69a95c12cc91e.

gcc/analyzer/ChangeLog:
PR analyzer/106626
* bounds-checking.cc (buffer_overflow::emit): Use inform_n.
Update wording to clarify that we're talking about the size of
the bad access, rather than its position.
(buffer_overread::emit): Likewise.

gcc/testsuite/ChangeLog:
PR analyzer/106626
* gcc.dg/analyzer/out-of-bounds-read-char-arr.c: Update for
changes to expected wording.
* gcc.dg/analyzer/out-of-bounds-read-int-arr.c: Likewise.
* gcc.dg/analyzer/out-of-bounds-write-char-arr.c: Likewise.
* gcc.dg/analyzer/out-of-bounds-write-int-arr.c: Likewise.

Signed-off-by: David Malcolm 
---
 gcc/analyzer/bounds-checking.cc   | 66 ---
 .../analyzer/out-of-bounds-read-char-arr.c| 11 +---
 .../analyzer/out-of-bounds-read-int-arr.c |  8 +--
 .../analyzer/out-of-bounds-write-char-arr.c   | 11 +---
 .../analyzer/out-of-bounds-write-int-arr.c|  8 +--
 5 files changed, 56 insertions(+), 48 deletions(-)

diff --git a/gcc/analyzer/bounds-checking.cc b/gcc/analyzer/bounds-checking.cc
index 19aaa51e6a8..ad7f431ea2f 100644
--- a/gcc/analyzer/bounds-checking.cc
+++ b/gcc/analyzer/bounds-checking.cc
@@ -143,17 +143,28 @@ public:
 
 if (warned)
   {
-   char num_bytes_past_buf[WIDE_INT_PRINT_BUFFER_SIZE];
-   print_dec (m_out_of_bounds_range.m_size_in_bytes,
-  num_bytes_past_buf, UNSIGNED);
-   if (m_diag_arg)
- inform (rich_loc->get_loc (), "write is %s bytes past the end"
-   " of %qE", num_bytes_past_buf,
-  m_diag_arg);
-   else
- inform (rich_loc->get_loc (), "write is %s bytes past the end"
-   "of the region",
-   num_bytes_past_buf);
+   if (wi::fits_uhwi_p (m_out_of_bounds_range.m_size_in_bytes))
+ {
+   unsigned HOST_WIDE_INT num_bad_bytes
+ = m_out_of_bounds_range.m_size_in_bytes.to_uhwi ();
+   if (m_diag_arg)
+ inform_n (rich_loc->get_loc (),
+   num_bad_bytes,
+   "write of %wu byte to beyond the end of %qE",
+   "write of %wu bytes to beyond the end of %qE",
+   num_bad_bytes,
+   m_diag_arg);
+   else
+ inform_n (rich_loc->get_loc (),
+   num_bad_bytes,
+   "write of %wu byte to beyond the end of the region",
+   "write of %wu bytes to beyond the end of the region",
+   num_bad_bytes);
+ }
+   else if (m_diag_arg)
+ inform (rich_loc->get_loc (),
+ "write to beyond the end of %qE",
+ m_diag_arg);
   }
 
 return warned;
@@ -212,17 +223,28 @@ public:
 
 if (warned)
   {
-   char num_bytes_past_buf[WIDE_INT_PRINT_BUFFER_SIZE];
-   print_dec (m_out_of_bounds_range.m_size_in_bytes,
-  num_bytes_past_buf, UNS

[committed] analyzer: fix ICE on bind/connect with a constant fd [PR107928]

2022-11-30 Thread David Malcolm via Gcc-patches

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r13-4424-g45a75fd3d31265.

gcc/analyzer/ChangeLog:
PR analyzer/107928
* sm-fd.cc (fd_state_machine::on_bind): Handle m_constant_fd in
the "success" outcome.
(fd_state_machine::on_connect): Likewise.
* sm-fd.dot: Add "constant_fd" state and its transitions.

gcc/testsuite/ChangeLog:
PR analyzer/107928
* gcc.dg/analyzer/fd-bind-pr107928.c: New test.
* gcc.dg/analyzer/fd-connect-pr107928.c: New test.
* gcc.dg/analyzer/fd-stream-socket-active-open.c
(test_active_open_from_connect_constant): New, adapted from
test_active_open_from_connect.
* gcc.dg/analyzer/fd-stream-socket-passive-open.c
(test_passive_open_from_bind_constant): New, adapted from
test_passive_open_from_bind.
(test_passive_open_from_listen_constant): New, adapted from
test_passive_open_from_listen.
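
As a rough picture of the transitions this change touches, the "success"
outcomes of bind and connect can be modelled as a small table in C.  The
enum and function names below are hypothetical and only mirror the edges
listed in sm-fd.dot and the branches visible in the diff; FD_UNHANDLED
is a made-up marker for states this sketch does not cover.

```c
/* Hypothetical mirror of the states named in sm-fd.dot.  */
enum fd_state
{
  FD_START,
  FD_CONSTANT_FD,
  FD_NEW_STREAM_SOCKET,
  FD_NEW_DATAGRAM_SOCKET,
  FD_NEW_UNKNOWN_SOCKET,
  FD_BOUND_STREAM_SOCKET,
  FD_BOUND_DATAGRAM_SOCKET,
  FD_BOUND_UNKNOWN_SOCKET,
  FD_CONNECTED_STREAM_SOCKET,
  FD_STOP,
  FD_UNHANDLED /* not part of the real state machine */
};

/* Next state when 'bind(X, ...)' succeeds.  */
static enum fd_state
next_state_on_bind_success (enum fd_state old_state)
{
  switch (old_state)
    {
    case FD_NEW_STREAM_SOCKET:
      return FD_BOUND_STREAM_SOCKET;
    case FD_NEW_DATAGRAM_SOCKET:
      return FD_BOUND_DATAGRAM_SOCKET;
    case FD_NEW_UNKNOWN_SOCKET:
      return FD_BOUND_UNKNOWN_SOCKET;
    case FD_START:
    case FD_CONSTANT_FD: /* the case added by this patch */
      return FD_BOUND_UNKNOWN_SOCKET;
    case FD_STOP:
      return FD_STOP;
    default:
      return FD_UNHANDLED;
    }
}

/* Next state when 'connect(X, ...)' succeeds.  */
static enum fd_state
next_state_on_connect_success (enum fd_state old_state)
{
  switch (old_state)
    {
    case FD_NEW_STREAM_SOCKET:
      return FD_CONNECTED_STREAM_SOCKET;
    case FD_NEW_DATAGRAM_SOCKET:
      return FD_NEW_DATAGRAM_SOCKET;
    case FD_NEW_UNKNOWN_SOCKET:
    case FD_START:
    case FD_CONSTANT_FD: /* the case added by this patch */
    case FD_STOP:
      return FD_STOP;
    default:
      return FD_UNHANDLED;
    }
}
```

In this model a constant descriptor such as the "1" passed to bind in
the new tests is treated the same way as a descriptor in the start state.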

Signed-off-by: David Malcolm 
---
 gcc/analyzer/sm-fd.cc |  6 +-
 gcc/analyzer/sm-fd.dot|  6 ++
 .../gcc.dg/analyzer/fd-bind-pr107928.c| 10 ++
 .../gcc.dg/analyzer/fd-connect-pr107928.c | 10 ++
 .../analyzer/fd-stream-socket-active-open.c   | 31 ++
 .../analyzer/fd-stream-socket-passive-open.c  | 98 +++
 6 files changed, 159 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/fd-bind-pr107928.c
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/fd-connect-pr107928.c

diff --git a/gcc/analyzer/sm-fd.cc b/gcc/analyzer/sm-fd.cc
index 794733e55ca..799847cb8e8 100644
--- a/gcc/analyzer/sm-fd.cc
+++ b/gcc/analyzer/sm-fd.cc
@@ -1861,7 +1861,8 @@ fd_state_machine::on_bind (const call_details &cd,
next_state = m_bound_datagram_socket;
   else if (old_state == m_new_unknown_socket)
next_state = m_bound_unknown_socket;
-  else if (old_state == m_start)
+  else if (old_state == m_start
+  || old_state == m_constant_fd)
next_state = m_bound_unknown_socket;
   else if (old_state == m_stop)
next_state = m_stop;
@@ -2116,7 +2117,8 @@ fd_state_machine::on_connect (const call_details &cd,
next_state = m_new_datagram_socket;
   else if (old_state == m_new_unknown_socket)
next_state = m_stop;
-  else if (old_state == m_start)
+  else if (old_state == m_start
+  || old_state == m_constant_fd)
next_state = m_stop;
   else if (old_state == m_stop)
next_state = m_stop;
diff --git a/gcc/analyzer/sm-fd.dot b/gcc/analyzer/sm-fd.dot
index da925b0989f..d7676b1f779 100644
--- a/gcc/analyzer/sm-fd.dot
+++ b/gcc/analyzer/sm-fd.dot
@@ -27,6 +27,9 @@ digraph "fd" {
   /* Start state.  */
   start;
 
+  /* State for a constant file descriptor (>= 0).  */
+  constant_fd;
+
   /* States representing a file descriptor that hasn't yet been
 checked for validity after opening, for three different
 access modes.  */
@@ -129,6 +132,7 @@ digraph "fd" {
 
   /* On "bind".  */
   start -> bound_unknown_socket [label="when 'bind(X, ...)' succeeds"];
+  constant_fd -> bound_unknown_socket [label="when 'bind(X, ...)' succeeds"];
   new_stream_socket -> bound_stream_socket [label="when 'bind(X, ...)' 
succeeds"];
   new_datagram_socket -> bound_datagram_socket [label="when 'bind(X, ...)' 
succeeds"];
   new_unknown_socket -> bound_unknown_socket [label="when 'bind(X, ...)' 
succeeds"];
@@ -140,12 +144,14 @@ digraph "fd" {
 
   /* On "accept".  */
   start -> connected_stream_socket [label="when 'accept(OTHER, ...)' succeeds 
on a listening_stream_socket"];
+  constant_fd -> connected_stream_socket [label="when 'accept(OTHER, ...)' 
succeeds on a listening_stream_socket"];
 
   /* On "connect".  */
   new_stream_socket -> connected_stream_socket [label="when 'connect(X, ...)' 
succeeds"];
   new_datagram_socket -> new_datagram_socket [label="when 'connect(X, ...)' 
succeeds"];
   new_unknown_socket -> stop [label="when 'connect(X, ...)' succeeds"];
   start -> stop [label="when 'connect(X, ...)' succeeds"];
+  constant_fd -> stop [label="when 'connect(X, ...)' succeeds"];
 
   /* on_condition.  */
   unchecked_read_write -> valid_read_write [label="on 'X >= 0'"];
diff --git a/gcc/testsuite/gcc.dg/analyzer/fd-bind-pr107928.c 
b/gcc/testsuite/gcc.dg/analyzer/fd-bind-pr107928.c
new file mode 100644
index 000..acc1a1df8e0
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/fd-bind-pr107928.c
@@ -0,0 +1,10 @@
+struct sa {};
+
+int
+bind (int, struct sa *, int);
+
+int
+foo (struct sa sa)
+{
+  return bind (1, &sa, sizeof sa);
+}
diff --git a/gcc/testsuite/gcc.dg/analyzer/fd-connect-pr107928.c 
b/gcc/testsuite/gcc.dg/analyzer/fd-connect-pr107928.c
new file mode 100644
index 000..f3bdc87c210
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/fd-connect-pr107928.c
@@ -0,0 +1,10 @@
+struct sa {};
+
+int
+connect (int, struct sa *, int);
+
+int
+foo (struct sa

Re: [PATCH V2] rs6000: Support to build constants by li/lis+oris/xoris

2022-11-30 Thread Jiufu Guo via Gcc-patches

Date: Thu, 01 Dec 2022 09:51:32 +0800
In-Reply-To: <20221125144309.gg25...@gate.crashing.org> (Segher Boessenkool's
message of "Fri, 25 Nov 2022 08:43:09 -0600")
Message-ID: <7ewn7bx55n@pike.rch.stglabs.ibm.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.2 (gnu/linux)

Segher Boessenkool  writes:

> Hi guys,
>
> On Fri, Nov 25, 2022 at 04:11:49PM +0800, Kewen.Lin wrote:
>> on 2022/10/26 19:40, Jiufu Guo wrote:
cut...
>> > +
>> > +  HOST_WIDE_INT imm = (ud1 & 0x8000) ? ((ud1 ^ 0x8000) - 0x8000)
>> > +   : ((ud2 << 16) - 0x8000);
>
> We really should have some "hwi::sign_extend (ud1, 16)" helper function,
> heh.  Maybe there already is?  Ah, "sext_hwi".  Fixing that up
> everywhere in this function is preapproved.

I just submitted a patch to use sext_hwi for existing code:
https://gcc.gnu.org/pipermail/gcc-patches/2022-December/607588.html

Thanks for your help and for spending time on this!

BR,
Jeff (Jiufu)

>
>> > +  else
cut...
>
> Could you comment what exact instructions are expected?
> li;xoris and li;xoris and lis;xoris I guess?  It helps if you just tell
> the reader here.
>
> The li;oris and li;xoris parts look good.
>
>
> Segher

Re: [PATCH V2] rs6000: Support to build constants by li/lis+oris/xoris

2022-11-30 Thread Jiufu Guo via Gcc-patches

Date: Thu, 01 Dec 2022 09:48:06 +0800
In-Reply-To: <20221128171950.gn25...@gate.crashing.org> (Segher Boessenkool's
message of "Mon, 28 Nov 2022 11:19:50 -0600")
Message-ID: <7e4jufyjvt@pike.rch.stglabs.ibm.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.2 (gnu/linux)

Segher Boessenkool  writes:

> On Mon, Nov 28, 2022 at 03:51:59PM +0800, Jiufu Guo wrote:
>> Jiufu Guo via Gcc-patches  writes:
>> > Segher Boessenkool  writes:
>> >>> > +  else
>> >>> > +  {
>> >>> > +emit_move_insn (temp,
>> >>> > +GEN_INT (((ud2 << 16) ^ 0x8000) - 
>> >>> > 0x8000));
>> >>> > +if (ud1 != 0)
>> >>> > +  emit_move_insn (temp, gen_rtx_IOR (DImode, temp, GEN_INT 
>> >>> > (ud1)));
>> >>> > +emit_move_insn (dest,
>> >>> > +gen_rtx_ZERO_EXTEND (DImode,
>> >>> > + gen_lowpart (SImode, 
>> >>> > temp)));
>> >>> > +  }
>> >>
>> >> Why this?  Please just write it in DImode, do not go via SImode?
>> > Thanks for catching this. Yes, gen_lowpart with DImode would be ok.
>> Oh, sorry. DImode cannot be used here.  The generated pattern with
>> DImode cannot be recognized.  SImode is used to match 'rlwxx'.
>
> There are patterns that accept DImode for rlwinm just fine.  Please use
>   (and:DI (const_int 0x) (x:DI))
> not the obfuscated
>   (zero_extend:DI (subreg:SI (x:DI) LOWBYTE))

I just submitted a simple patch for this:
https://gcc.gnu.org/pipermail/gcc-patches/2022-December/607589.html

Thanks for the comments!

BR,
Jeff (Jiufu)
>
>
> Segher

[PATCH 3/3]rs6000: NFC no need copy_rtx in rs6000_emit_set_long_const and rs6000_emit_set_const

2022-11-30 Thread Jiufu Guo via Gcc-patches

Hi,

The functions rs6000_emit_set_const and rs6000_emit_set_long_const are only
invoked from two "define_split"s whose target operand is limited to
gpc_reg_operand or int_reg_operand, so the operand must be REG_P.
Within those functions, the temp rtx is created with code like
"gen_reg_rtx({S|D}Imode)", so it must also be REG_P.
So copy_rtx is not needed for temp or dest.

This patch removes those "copy_rtx" calls from rs6000_emit_set_const and
rs6000_emit_set_long_const.

Bootstrap & regtest pass on ppc64{,le}.
Is this ok for trunk? 

BR,
Jeff (Jiufu)

gcc/ChangeLog:

* config/rs6000/rs6000.cc (rs6000_emit_set_const): Remove copy_rtx.
(rs6000_emit_set_long_const): Likewise.

---
 gcc/config/rs6000/rs6000.cc | 58 +
 1 file changed, 20 insertions(+), 38 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 718072cc9a1..1a51b79ebfe 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -10186,10 +10186,9 @@ rs6000_emit_set_const (rtx dest, rtx source)
 case E_SImode:
   temp = !can_create_pseudo_p () ? dest : gen_reg_rtx (SImode);
 
-  emit_insn (gen_rtx_SET (copy_rtx (temp),
- GEN_INT (c & ~(HOST_WIDE_INT) 0x)));
+  emit_insn (gen_rtx_SET (temp, GEN_INT (c & ~(HOST_WIDE_INT) 0x)));
   emit_insn (gen_rtx_SET (dest,
- gen_rtx_IOR (SImode, copy_rtx (temp),
+ gen_rtx_IOR (SImode, temp,
   GEN_INT (c & 0x;
   break;
 
@@ -10198,10 +10197,8 @@ rs6000_emit_set_const (rtx dest, rtx source)
{
  rtx hi, lo;
 
- hi = operand_subword_force (copy_rtx (dest), WORDS_BIG_ENDIAN == 0,
- DImode);
- lo = operand_subword_force (dest, WORDS_BIG_ENDIAN != 0,
- DImode);
+ hi = operand_subword_force (dest, WORDS_BIG_ENDIAN == 0, DImode);
+ lo = operand_subword_force (dest, WORDS_BIG_ENDIAN != 0, DImode);
  emit_move_insn (hi, GEN_INT (c >> 32));
  c = sext_hwi (c, 32);
  emit_move_insn (lo, GEN_INT (c));
@@ -10249,23 +10246,19 @@ rs6000_emit_set_long_const (rtx dest, HOST_WIDE_INT c)
 {
   temp = !can_create_pseudo_p () ? dest : gen_reg_rtx (DImode);
 
-  emit_move_insn (ud1 != 0 ? copy_rtx (temp) : dest,
+  emit_move_insn (ud1 != 0 ? temp : dest,
  GEN_INT (sext_hwi (ud2 << 16, 32)));
   if (ud1 != 0)
-   emit_move_insn (dest,
-   gen_rtx_IOR (DImode, copy_rtx (temp),
-GEN_INT (ud1)));
+   emit_move_insn (dest, gen_rtx_IOR (DImode, temp, GEN_INT (ud1)));
 }
   else if (ud3 == 0 && ud4 == 0)
 {
   temp = !can_create_pseudo_p () ? dest : gen_reg_rtx (DImode);
 
   gcc_assert (ud2 & 0x8000);
-  emit_move_insn (copy_rtx (temp), GEN_INT (sext_hwi (ud2 << 16, 32)));
+  emit_move_insn (temp, GEN_INT (sext_hwi (ud2 << 16, 32)));
   if (ud1 != 0)
-   emit_move_insn (copy_rtx (temp),
-   gen_rtx_IOR (DImode, copy_rtx (temp),
-GEN_INT (ud1)));
+   emit_move_insn (temp, gen_rtx_IOR (DImode, temp, GEN_INT (ud1)));
   emit_move_insn (dest, gen_rtx_AND (DImode, temp, GEN_INT (0x)));
 }
   else if (ud1 == ud3 && ud2 == ud4)
@@ -10282,18 +10275,13 @@ rs6000_emit_set_long_const (rtx dest, HOST_WIDE_INT c)
 {
   temp = !can_create_pseudo_p () ? dest : gen_reg_rtx (DImode);
 
-  emit_move_insn (copy_rtx (temp), GEN_INT (sext_hwi (ud3 << 16, 32)));
+  emit_move_insn (temp, GEN_INT (sext_hwi (ud3 << 16, 32)));
   if (ud2 != 0)
-   emit_move_insn (copy_rtx (temp),
-   gen_rtx_IOR (DImode, copy_rtx (temp),
-GEN_INT (ud2)));
-  emit_move_insn (ud1 != 0 ? copy_rtx (temp) : dest,
- gen_rtx_ASHIFT (DImode, copy_rtx (temp),
- GEN_INT (16)));
+   emit_move_insn (temp, gen_rtx_IOR (DImode, temp, GEN_INT (ud2)));
+  emit_move_insn (ud1 != 0 ? temp : dest,
+ gen_rtx_ASHIFT (DImode, temp, GEN_INT (16)));
   if (ud1 != 0)
-   emit_move_insn (dest,
-   gen_rtx_IOR (DImode, copy_rtx (temp),
-GEN_INT (ud1)));
+   emit_move_insn (dest, gen_rtx_IOR (DImode, temp, GEN_INT (ud1)));
 }
   else if (TARGET_PREFIXED)
 {
@@ -10334,23 +10322,17 @@ rs6000_emit_set_long_const (rtx dest, HOST_WIDE_INT c)
 {
   temp = !can_create_pseudo_p () ? dest : gen_reg_rtx (DImode);
 
-  emit_move_insn (copy_rtx (temp), GEN_INT (sext_hwi (ud4 << 16, 32)));
+  emit_move_insn (temp, GEN_INT (sext_hwi (ud4 << 16, 32)));
   if (ud3 != 0)
-   emit_move_insn (copy_rtx (temp),
-

[PATCH 2/3]rs6000: NFC use sext_hwi to replace ((v&0xf..f)^0x80..0) - 0x80..0

2022-11-30 Thread Jiufu Guo via Gcc-patches

Hi,

This patch just uses sext_hwi to replace expressions like:
((value & 0xf..f) ^ 0x80..0) - 0x80..0 for rs6000.cc and rs6000.md.
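
To show why the two forms are interchangeable, here is a small
standalone C sketch of the xor/subtract idiom next to an explicit
sign-bit test.  These are stand-ins written for illustration, not GCC's
sext_hwi itself; the names "sext_xor_sub" and "sext_branch" are
hypothetical, and the precision is assumed to be between 1 and 32 bits.

```c
#include <stdint.h>

/* Stand-in for the xor/subtract idiom.  Sign-extends the low PREC bits
   of VALUE.  Assumes 1 <= PREC <= 32, which covers the 16- and 32-bit
   uses in the patch.  */
static int64_t
sext_xor_sub (int64_t value, unsigned prec)
{
  uint64_t mask = (UINT64_C (1) << prec) - 1;
  uint64_t sign = UINT64_C (1) << (prec - 1);
  uint64_t low = (uint64_t) value & mask;
  /* The arithmetic is unsigned to avoid signed overflow; the final
     conversion relies on the usual two's complement behaviour.  */
  return (int64_t) ((low ^ sign) - sign);
}

/* The same result written as an explicit test of the sign bit.  */
static int64_t
sext_branch (int64_t value, unsigned prec)
{
  int64_t modulus = (int64_t) 1 << prec;
  int64_t low = value & (modulus - 1);
  return (low & (modulus >> 1)) ? low - modulus : low;
}
```

For example, with a precision of 16 the input 0x8000 maps to -32768 and
0x7fff stays 32767 in both versions; flipping the sign bit and then
subtracting it turns "sign bit set" into a borrow that fills the upper
bits with ones.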

Bootstrap & regtest pass on ppc64{,le}.
Is this ok for trunk? 

BR,
Jeff (Jiufu)

gcc/ChangeLog:

* config/rs6000/rs6000.cc (num_insns_constant_gpr): Use sext_hwi.
(darwin_rs6000_legitimate_lo_sum_const_p): Likewise.
(mem_operand_gpr): Likewise.
(mem_operand_ds_form): Likewise.
(rs6000_legitimize_address): Likewise.
(rs6000_emit_set_const): Likewise.
(rs6000_emit_set_long_const): Likewise.
(print_operand): Likewise.
* config/rs6000/rs6000.md: Likewise.

---
 gcc/config/rs6000/rs6000.cc | 30 +-
 gcc/config/rs6000/rs6000.md | 10 +-
 2 files changed, 18 insertions(+), 22 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 5efe9b22d8b..718072cc9a1 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -6021,7 +6021,7 @@ num_insns_constant_gpr (HOST_WIDE_INT value)
 
   else if (TARGET_POWERPC64)
 {
-  HOST_WIDE_INT low  = ((value & 0x) ^ 0x8000) - 0x8000;
+  HOST_WIDE_INT low = sext_hwi (value, 32);
   HOST_WIDE_INT high = value >> 31;
 
   if (high == 0 || high == -1)
@@ -8456,7 +8456,7 @@ darwin_rs6000_legitimate_lo_sum_const_p (rtx x, 
machine_mode mode)
 }
 
   /* We only care if the access(es) would cause a change to the high part.  */
-  offset = ((offset & 0x) ^ 0x8000) - 0x8000;
+  offset = sext_hwi (offset, 16);
   return SIGNED_16BIT_OFFSET_EXTRA_P (offset, extra);
 }
 
@@ -8522,7 +8522,7 @@ mem_operand_gpr (rtx op, machine_mode mode)
   if (GET_CODE (addr) == LO_SUM)
 /* For lo_sum addresses, we must allow any offset except one that
causes a wrap, so test only the low 16 bits.  */
-offset = ((offset & 0x) ^ 0x8000) - 0x8000;
+offset = sext_hwi (offset, 16);
 
   return SIGNED_16BIT_OFFSET_EXTRA_P (offset, extra);
 }
@@ -8562,7 +8562,7 @@ mem_operand_ds_form (rtx op, machine_mode mode)
   if (GET_CODE (addr) == LO_SUM)
 /* For lo_sum addresses, we must allow any offset except one that
causes a wrap, so test only the low 16 bits.  */
-offset = ((offset & 0x) ^ 0x8000) - 0x8000;
+offset = sext_hwi (offset, 16);
 
   return SIGNED_16BIT_OFFSET_EXTRA_P (offset, extra);
 }
@@ -9136,7 +9136,7 @@ rs6000_legitimize_address (rtx x, rtx oldx 
ATTRIBUTE_UNUSED,
 {
   HOST_WIDE_INT high_int, low_int;
   rtx sum;
-  low_int = ((INTVAL (XEXP (x, 1)) & 0x) ^ 0x8000) - 0x8000;
+  low_int = sext_hwi (INTVAL (XEXP (x, 1)), 16);
   if (low_int >= 0x8000 - extra)
low_int = 0;
   high_int = INTVAL (XEXP (x, 1)) - low_int;
@@ -10203,7 +10203,7 @@ rs6000_emit_set_const (rtx dest, rtx source)
  lo = operand_subword_force (dest, WORDS_BIG_ENDIAN != 0,
  DImode);
  emit_move_insn (hi, GEN_INT (c >> 32));
- c = ((c & 0x) ^ 0x8000) - 0x8000;
+ c = sext_hwi (c, 32);
  emit_move_insn (lo, GEN_INT (c));
}
   else
@@ -10242,7 +10242,7 @@ rs6000_emit_set_long_const (rtx dest, HOST_WIDE_INT c)
 
   if ((ud4 == 0x && ud3 == 0x && ud2 == 0x && (ud1 & 0x8000))
   || (ud4 == 0 && ud3 == 0 && ud2 == 0 && ! (ud1 & 0x8000)))
-emit_move_insn (dest, GEN_INT ((ud1 ^ 0x8000) - 0x8000));
+emit_move_insn (dest, GEN_INT (sext_hwi (ud1, 16)));
 
   else if ((ud4 == 0x && ud3 == 0x && (ud2 & 0x8000))
   || (ud4 == 0 && ud3 == 0 && ! (ud2 & 0x8000)))
@@ -10250,7 +10250,7 @@ rs6000_emit_set_long_const (rtx dest, HOST_WIDE_INT c)
   temp = !can_create_pseudo_p () ? dest : gen_reg_rtx (DImode);
 
   emit_move_insn (ud1 != 0 ? copy_rtx (temp) : dest,
- GEN_INT (((ud2 << 16) ^ 0x8000) - 0x8000));
+ GEN_INT (sext_hwi (ud2 << 16, 32)));
   if (ud1 != 0)
emit_move_insn (dest,
gen_rtx_IOR (DImode, copy_rtx (temp),
@@ -10261,8 +10261,7 @@ rs6000_emit_set_long_const (rtx dest, HOST_WIDE_INT c)
   temp = !can_create_pseudo_p () ? dest : gen_reg_rtx (DImode);
 
   gcc_assert (ud2 & 0x8000);
-  emit_move_insn (copy_rtx (temp),
- GEN_INT (((ud2 << 16) ^ 0x8000) - 0x8000));
+  emit_move_insn (copy_rtx (temp), GEN_INT (sext_hwi (ud2 << 16, 32)));
   if (ud1 != 0)
emit_move_insn (copy_rtx (temp),
gen_rtx_IOR (DImode, copy_rtx (temp),
@@ -10273,7 +10272,7 @@ rs6000_emit_set_long_const (rtx dest, HOST_WIDE_INT c)
 {
   temp = !can_create_pseudo_p () ? dest : gen_reg_rtx (DImode);
   HOST_WIDE_INT num = (ud2 << 16) | ud1;
-  rs6000_emit_set_long_const (temp, (num ^ 0x8000) - 0x8000);
+  rs6000_emit_set_long_const (temp, sext_hwi (num, 32));
   rtx one = gen_rtx_AND (DImode, temp, GEN_INT (0x)

[PATCH 1/3]rs6000: NFC use more readable pattern to clean high 32 bits

2022-11-30 Thread Jiufu Guo via Gcc-patches

Hi,

This patch just uses a more readable pattern for "rldicl x,x,0,32"
to clear the high 32 bits.
The old pattern looks like: r118:DI=zero_extend(r120:DI#0)
The new pattern looks like: r118:DI=r120:DI&0x
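
At the C level the two patterns compute the same value, as the sketch
below illustrates.  It is not RTL and says nothing about which pattern
the backend recognizes; the function names are hypothetical, and the
mask is assumed to be the low-32-bit mask 0xffffffff, which is what
clearing the high 32 bits implies (the constant is cut off in the text
above).

```c
#include <stdint.h>

/* Analogue of zero_extend applied to the low 32-bit part of X.  */
static uint64_t
clear_high32_zero_extend (uint64_t x)
{
  return (uint64_t) (uint32_t) x;
}

/* Analogue of an AND with a low-32-bit mask (assumed 0xffffffff).  */
static uint64_t
clear_high32_and (uint64_t x)
{
  return x & UINT64_C (0xffffffff);
}
```

Both functions return 0x89abcdef for the input 0x0123456789abcdef; the
AND form simply states the intent directly instead of going through a
narrower type.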

Bootstrap and regtest pass on ppc64{,le}.
Is this ok for trunk?

BR,
Jeff (Jiufu)

gcc/ChangeLog:

* config/rs6000/rs6000.cc (rs6000_emit_set_long_const): Update
zero_extend(reg:DI#0) to reg:DI&0x.

---
 gcc/config/rs6000/rs6000.cc | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index eb7ad5e954f..5efe9b22d8b 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -10267,10 +10267,7 @@ rs6000_emit_set_long_const (rtx dest, HOST_WIDE_INT c)
emit_move_insn (copy_rtx (temp),
gen_rtx_IOR (DImode, copy_rtx (temp),
 GEN_INT (ud1)));
-  emit_move_insn (dest,
- gen_rtx_ZERO_EXTEND (DImode,
-  gen_lowpart (SImode,
-   copy_rtx (temp;
+  emit_move_insn (dest, gen_rtx_AND (DImode, temp, GEN_INT (0x)));
 }
   else if (ud1 == ud3 && ud2 == ud4)
 {
-- 
2.17.1

Re: [PATCH 2/2] Improve error message for excess elements in array initializer from {"a"}

2022-11-30 Thread Segher Boessenkool

On Thu, Dec 01, 2022 at 12:17:30AM +0100, Andreas Schwab wrote:
> On Nov 30 2022, Segher Boessenkool wrote:
> 
> > char u[1] = { "x", "x" }; /* { dg-error {excess elements in 'char[1]' 
> > initializer} } */
> 
> That won't work, as '[1]' is a bracket expression only matching '1'.
> You'll need {... 'char\[1\]' ...}.

Heh.  Yeah, char\[1].  Sorry.


Segher

Re: [PATCH] longlong.h: Do no use asm input cast for clang

2022-11-30 Thread Segher Boessenkool

Hi!

On Wed, Nov 30, 2022 at 03:16:25PM -0300, Adhemerval Zanella via Gcc-patches 
wrote:
> clang by default rejects the input casts with:
> 
>   error: invalid use of a cast in a inline asm context requiring an
>   lvalue: remove the cast or build with -fheinous-gnu-extensions
> 
> And even with -fheinous-gnu-extensions clang still emits a warning
> and also states that this option might be removed in the future.
> For gcc the casts are still somewhat useful [1], so just remove them
> when clang is used.

This is one of the things in inline asm that is tightly tied to GCC
internals.  You should emulate GCC's behaviour faithfully if you want
to claim you implement the inline asm GNU C extension.

> --- a/include/ChangeLog
> +++ b/include/ChangeLog

That should not be part of the patch?  Changelog entries should be
verbatim in the message you send.

The size of this patch already makes clear this is a bad idea, imo.
This code is already hard enough to read.

Segher

Re: [PATCH 2/2] Improve error message for excess elements in array initializer from {"a"}

2022-11-30 Thread Andreas Schwab

On Nov 30 2022, Segher Boessenkool wrote:

> char u[1] = { "x", "x" }; /* { dg-error {excess elements in 'char[1]' 
> initializer} } */

That won't work, as '[1]' is a bracket expression only matching '1'.
You'll need {... 'char\[1\]' ...}.
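
The bracket-expression behaviour can be reproduced outside the
testsuite.  The sketch below uses POSIX regcomp/regexec rather than the
Tcl regexps that dg-error actually uses, on the assumption that the two
treat '[1]' the same way; "pattern_matches" is a hypothetical helper.

```c
#include <regex.h>
#include <stddef.h>

/* Return 1 if PATTERN (a POSIX extended regexp) matches somewhere in
   TEXT, 0 if it does not, and -1 if the pattern fails to compile.  */
static int
pattern_matches (const char *pattern, const char *text)
{
  regex_t re;
  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;
  int rc = regexec (&re, text, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}
```

With this helper, the pattern "char[1]" matches the text "char1" but not
"char[1]", while "char\\[1\\]" (written as a C string literal) matches
"char[1]" as intended.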

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

Re: [PATCH 2/2] Improve error message for excess elements in array initializer from {"a"}

2022-11-30 Thread Segher Boessenkool

Hi!

On Wed, Nov 30, 2022 at 09:18:15AM -0800, apinski--- via Gcc-patches wrote:
> Note in the testsuite I used regex . to match '[' and ']' as
> I could not figure out how many '\' I needed.

Don't use double quotes then :-)  Inside double quotes all of command
substitution, variable substitution, and backslash substitution are
performed.  In a regexp you typically want none of this.  You usually do
want the whitespace to be significant, so it is good to quote it in
braces though (unless you like quoting all your whitespace).

> -char u[1] = { "x", "x" }; /* { dg-error "excess elements in 'char' array 
> initializer" } */
> +char u[1] = { "x", "x" }; /* { dg-error "excess elements in 'char.1.' 
> initializer" } */

char u[1] = { "x", "x" }; /* { dg-error {excess elements in 'char[1]' 
initializer} } */

See  for a very short
page that has *all* Tcl syntax!

Segher

Re: [PATCH] RISC-V: optimize stack manipulation in save-restore

2022-11-30 Thread Palmer Dabbelt


On Wed, 30 Nov 2022 00:37:17 PST (-0800), gao...@eswincomputing.com wrote:

The stack that save-restore reserves is not folded well into stack
allocation and deallocation.
This patch allows fewer instructions to be used in stack allocation and
deallocation if save-restore is enabled,
and also gives much clearer logic for save-restore stack manipulation.

before patch:
bar:
        call    t0,__riscv_save_4
        addi    sp,sp,-64
        ...
        li      t0,-12288
        addi    t0,t0,-1968 # optimized out after patch
        add     sp,sp,t0 # prologue
        ...
        li      t0,12288 # epilogue
        addi    t0,t0,2000 # optimized out after patch
        add     sp,sp,t0
        ...
        addi    sp,sp,32
        tail    __riscv_restore_4

after patch:
bar:
        call    t0,__riscv_save_4
        addi    sp,sp,-2032
        ...
        li      t0,-12288
        add     sp,sp,t0 # prologue
        ...
        li      t0,12288 # epilogue
        add     sp,sp,t0
        ...
        addi    sp,sp,2032
        tail    __riscv_restore_4
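
The arithmetic behind the "after patch" sequence can be sketched as
follows.  This is a simplified model, not the code in riscv.cc: it
ignores poly_int sizes and RVC, assumes a 12-bit signed addi immediate
and a 16-byte stack alignment, and "first_stack_step" here is a
hypothetical stand-in that takes the minimum first step as a plain
argument.

```c
#include <stdint.h>

enum
{
  IMM_REACH = 4096, /* span of a 12-bit immediate (assumed) */
  STACK_ALIGN = 16  /* stack alignment in bytes (assumed) */
};

/* Does V fit the signed 12-bit immediate of addi?  */
static int
fits_addi (int64_t v)
{
  return v >= -IMM_REACH / 2 && v < IMM_REACH / 2;
}

/* Choose the size of the first stack adjustment for a frame of
   REMAINING bytes, given that at least MIN_FIRST bytes must be covered
   by the first step.  */
static int64_t
first_stack_step (int64_t remaining, int64_t min_first)
{
  if (fits_addi (remaining))
    return remaining;

  int64_t max_first = IMM_REACH / 2 - STACK_ALIGN;
  int64_t low = remaining % IMM_REACH;

  /* If the rest would need more than one addi anyway, take the low
     bits first so that what is left is a multiple of IMM_REACH and can
     be loaded with a single li.  */
  if (!fits_addi (remaining - max_first)
      && low < IMM_REACH / 2
      && low >= min_first)
    return low;

  return max_first;
}
```

For a total of 14320 bytes (2032 + 12288, as in the listing above) the
first step comes out as 2032, leaving 12288, which is a multiple of 4096
and so needs only "li t0,-12288" followed by the add.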

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_first_stack_step): add a new function 
parameter remaining_size.
(riscv_compute_frame_info): adapt new riscv_first_stack_step interface.
(riscv_expand_prologue): consider save-restore in stack allocation.
(riscv_expand_epilogue): consider save-restore in stack deallocation.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/stack_save_restore.c: New test.
---
 gcc/config/riscv/riscv.cc | 58 ++-
 .../gcc.target/riscv/stack_save_restore.c | 40 +
 2 files changed, 70 insertions(+), 28 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/stack_save_restore.c


I guess with the RISC-V backend still being open for things as big as 
the V port we should probably be taking code like this as well?  I 
wouldn't be opposed to making an exception for the V code and holding 
everything else back, though.



diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 05bdba5ab4d..9e92e729a5f 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -4634,7 +4634,7 @@ riscv_save_libcall_count (unsigned mask)
They decrease stack_pointer_rtx but leave frame_pointer_rtx and
hard_frame_pointer_rtx unchanged.  */

-static HOST_WIDE_INT riscv_first_stack_step (struct riscv_frame_info *frame);
+static HOST_WIDE_INT riscv_first_stack_step (struct riscv_frame_info *frame, 
poly_int64 remaining_size);

 /* Handle stack align for poly_int.  */
 static poly_int64
@@ -4663,7 +4663,7 @@ riscv_compute_frame_info (void)
  save/restore t0.  We check for this before clearing the frame struct.  */
   if (cfun->machine->interrupt_handler_p)
 {
-  HOST_WIDE_INT step1 = riscv_first_stack_step (frame);
+  HOST_WIDE_INT step1 = riscv_first_stack_step (frame, frame->total_size);
   if (! POLY_SMALL_OPERAND_P ((frame->total_size - step1)))
interrupt_save_prologue_temp = true;
 }
@@ -4913,31 +4913,31 @@ riscv_restore_reg (rtx reg, rtx mem)
without adding extra instructions.  */

 static HOST_WIDE_INT
-riscv_first_stack_step (struct riscv_frame_info *frame)
+riscv_first_stack_step (struct riscv_frame_info *frame, poly_int64 
remaining_size)
 {
-  HOST_WIDE_INT frame_total_constant_size;
-  if (!frame->total_size.is_constant ())
-frame_total_constant_size
-  = riscv_stack_align (frame->total_size.coeffs[0])
-   - riscv_stack_align (frame->total_size.coeffs[1]);
+  HOST_WIDE_INT remaining_const_size;
+  if (!remaining_size.is_constant ())
+remaining_const_size
+  = riscv_stack_align (remaining_size.coeffs[0])
+   - riscv_stack_align (remaining_size.coeffs[1]);


The alignment looks off here, at least in the email.  Worth fixing it up 
if you're touching the lines anyway.



   else
-frame_total_constant_size = frame->total_size.to_constant ();
+remaining_const_size = remaining_size.to_constant ();

-  if (SMALL_OPERAND (frame_total_constant_size))
-return frame_total_constant_size;
+  if (SMALL_OPERAND (remaining_const_size))
+return remaining_const_size;

   HOST_WIDE_INT min_first_step =
-RISCV_STACK_ALIGN ((frame->total_size - 
frame->frame_pointer_offset).to_constant());
+RISCV_STACK_ALIGN ((remaining_size - 
frame->frame_pointer_offset).to_constant());
   HOST_WIDE_INT max_first_step = IMM_REACH / 2 - PREFERRED_STACK_BOUNDARY / 8;
-  HOST_WIDE_INT min_second_step = frame_total_constant_size - max_first_step;
+  HOST_WIDE_INT min_second_step = remaining_const_size - max_first_step;
   gcc_assert (min_first_step <= max_first_step);

   /* As an optimization, use the least-significant bits of the total frame
  size, so that the second ad

Re: Ping: [PATCH] maintainer-scripts: Add gdc to update_web_docs_git

2022-11-30 Thread Iain Buclaw via Gcc-patches

Hi Gerald,

Excerpts from Gerald Pfeifer's message of November 29, 2022 9:21 pm:
> Hi Iain,
> 
> On Tue, 29 Nov 2022, Iain Buclaw via Gcc-patches wrote:
>> This looks obvious, however I don't know how things are generated for
>> the online documentation site in order to say this won't cause any
>> problems for whatever process is building these pages.
> 
>>> maintainer-scripts/ChangeLog:
>>> 
>>> * update_web_docs_git: Add gdc to MANUALS.
> 
> please go ahead and let me know when done. I'll see how I can help.
> 

Thanks, I've committed it - along with a bit of content I've been
working on since the temporary switch to Sphinx (the gdc pages seem
to still be up at https://gcc.gnu.org/onlinedocs/gdc/).

As far as I understand, there's also a corresponding wwwdocs change to
be done so that there's a reference from the main onlinedocs page.
I will wait until the docs have been confirmed rebuilt before submitting that.

Iain.

[committed] d: Add language reference section to documentation files.

2022-11-30 Thread Iain Buclaw via Gcc-patches

Hi,

This adds an initial body of documentation for the D front-end - other
than the existing documentation for command-line usage/the man page.

Documentation covers code generation choices specific to GNU D - what
attributes are supported, intrinsics, pragmas, predefined versions,
language extensions, missing features and deviations from spec.

More could be added or elaborated upon, such as what linkage do
different symbols get, mixed language programming with C and C++, the
anatomy of a TypeInfo and ModuleInfo object, and so on.  This is enough
as a first wave just to get it off the ground.

Tested with `make html', and committed to mainline.

Regards,
Iain

---
gcc/d/ChangeLog:

* Make-lang.in (D_TEXI_FILES): Add d/implement-d.texi.
* gdc.texi: Adjust introduction, include implement-d.texi.
* implement-d.texi: New file.
---
 gcc/d/Make-lang.in |1 +
 gcc/d/gdc.texi |   12 +-
 gcc/d/implement-d.texi | 2514 
 3 files changed, 2523 insertions(+), 4 deletions(-)
 create mode 100644 gcc/d/implement-d.texi

diff --git a/gcc/d/Make-lang.in b/gcc/d/Make-lang.in
index 144d5b88483..b5264613db0 100644
--- a/gcc/d/Make-lang.in
+++ b/gcc/d/Make-lang.in
@@ -239,6 +239,7 @@ d21$(exeext): $(D_ALL_OBJS) attribs.o $(BACKEND) $(LIBDEPS) 
$(d.prev)
 
 D_TEXI_FILES = \
d/gdc.texi \
+   d/implement-d.texi \
$(gcc_docdir)/include/fdl.texi \
$(gcc_docdir)/include/gpl_v3.texi \
$(gcc_docdir)/include/gcc-common.texi \
diff --git a/gcc/d/gdc.texi b/gcc/d/gdc.texi
index 45dc544e83f..c99c36558a9 100644
--- a/gcc/d/gdc.texi
+++ b/gcc/d/gdc.texi
@@ -65,13 +65,15 @@ Boston, MA 02110-1301, USA@*
 @top Introduction
 
 This manual describes how to use @command{gdc}, the GNU compiler for
-the D programming language.  This manual is specifically about
-@command{gdc}.  For more information about the D programming
-language in general, including language specifications and standard
-package documentation, see @uref{https://dlang.org/}.
+the D programming language.  This manual is specifically about how to
+invoke @command{gdc}, as well as its features and incompatibilities.
+For more information about the D programming language in general,
+including language specifications and standard package documentation,
+see @uref{https://dlang.org/}.
 
 @menu
 * Invoking gdc::How to run gdc.
+* D Implementation::User-visible implementation details.
 * Copying:: The GNU General Public License.
 * GNU Free Documentation License::
 How you can share and copy this manual.
@@ -838,6 +840,8 @@ and all @code{function} bodies that are being compiled.
 
 @c man end
 
+@include implement-d.texi
+
 @include gpl_v3.texi
 @include fdl.texi
 
diff --git a/gcc/d/implement-d.texi b/gcc/d/implement-d.texi
new file mode 100644
index 000..8f3f825e797
--- /dev/null
+++ b/gcc/d/implement-d.texi
@@ -0,0 +1,2514 @@
+@ignore
+Copyright (C) 2022 Free Software Foundation, Inc.
+This is part of the GNU D manual.
+For copying conditions, see the file gdc.texi.
+@end ignore
+
+@node D Implementation
+@chapter Language Reference
+@cindex language reference, D language
+
+The implementation of the D programming language used by the GNU D compiler is
+shared with parts of the front-end for the Digital Mars D compiler, hosted at
+@uref{https://github.com/dlang/dmd/}.  This common front-end covers lexical
+analysis, parsing, and semantic analysis of the D programming language defined
+in the documents at @uref{https://dlang.org/}.
+
+The implementation details described in this manual are GNU D extensions to the
+D programming language.  If you want to write code that checks whether these
+features are available, you can test for the predefined version @code{GNU}, or
+you can check whether a specific feature is compilable using
+@code{__traits(compiles)}.
+
+@smallexample
+version (GNU)
+@{
+import gcc.builtins;
+return __builtin_atan2(x, y);
+@}
+
+static if (__traits(compiles, @{ asm @{"";@} @}))
+@{
+asm @{ "magic instruction"; @}
+@}
+@end smallexample
+
+@menu
+* Attributes::  Implementation-defined attributes.
+* Builtin Functions::   GCC built-ins module.
+* ImportC:: Importing C sources into D.
+* Inline Assembly:: Interfacing D with assembler.
+* Intrinsics::  Intrinsic functions supported by GDC.
+* Predefined Pragmas::  Pragmas accepted by GDC.
+* Predefined Versions:: List of versions for conditional compilation.
+* Special Enums::   Intrinsic type interoperability with C and C++.
+* Traits::  Compile-time reflection extensions.
+* Vector Extensions::   Using vector types and supported operations.
+* Vector Intrinsics::   Vector instructions through intrinsics.
+* Missing Features::Deviations from the D2 specification in GDC.
+@end menu
+
+
+@c 
+
+@node

[committed] d: Update recipes for building html and pdf documentation

2022-11-30 Thread Iain Buclaw via Gcc-patches

Hi,

This patch sorts out the include directories for building the gdc docs -
we don't need to include anything from the toplevel docs directory.

The html output directory has also been renamed from /d/ to /gdc/ to
make it clearer that this is vendor-specific documentation.

Tested by building and checking pdf/info/man/html pages, and committed
to mainline.

Regards,
Iain.

---
gcc/d/ChangeLog:

* Make-lang.in: Only include doc/include when building documentation.
(d.html): Rename html directory to $(build_htmldir)/gdc.
---
 gcc/d/Make-lang.in | 15 +++
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/gcc/d/Make-lang.in b/gcc/d/Make-lang.in
index 28313208ec9..b5264613db0 100644
--- a/gcc/d/Make-lang.in
+++ b/gcc/d/Make-lang.in
@@ -253,16 +253,15 @@ doc/gdc.info: $(D_TEXI_FILES)
else true; fi
 
 doc/gdc.dvi: $(D_TEXI_FILES)
-   $(TEXI2DVI) -I $(abs_docdir) -I $(abs_docdir)/include -o $@ $<
+   $(TEXI2DVI) -I $(abs_docdir)/include -o $@ $<
 
 doc/gdc.pdf: $(D_TEXI_FILES)
-   $(TEXI2PDF) -I $(abs_docdir) -I $(abs_docdir)/include -o $@ $<
+   $(TEXI2PDF) -I $(abs_docdir)/include -o $@ $<
 
-$(build_htmldir)/d/index.html: $(D_TEXI_FILES)
+$(build_htmldir)/gdc/index.html: $(D_TEXI_FILES)
$(mkinstalldirs) $(@D)
rm -f $(@D)/*
-   $(TEXI2HTML) -I $(gcc_docdir) -I $(gcc_docdir)/include \
-   -I $(srcdir)/d -o $(@D) $<
+   $(TEXI2HTML) -I $(gcc_docdir)/include -I $(srcdir)/d -o $(@D) $<
 
 .INTERMEDIATE: gdc.pod
 
@@ -277,7 +276,7 @@ d.rest.encap:
 d.info: doc/gdc.info
 d.dvi: doc/gdc.dvi
 d.pdf: doc/gdc.pdf
-d.html: $(build_htmldir)/d/index.html
+d.html: $(build_htmldir)/gdc/index.html
 d.srcinfo: doc/gdc.info
-cp -p $^ $(srcdir)/doc
 d.srcextra:
@@ -341,10 +340,10 @@ d.install-dvi: doc/gdc.dvi
  $(INSTALL_DATA) "$$d$$p" "$(DESTDIR)$(dvidir)/gcc/$$f"; \
done
 
-d.install-html: $(build_htmldir)/d
+d.install-html: $(build_htmldir)/gdc
@$(NORMAL_INSTALL)
test -z "$(htmldir)" || $(mkinstalldirs) "$(DESTDIR)$(htmldir)"
-   @for p in $(build_htmldir)/d; do \
+   @for p in $(build_htmldir)/gdc; do \
  if test -f "$$p" || test -d "$$p"; then d=""; else d="$(srcdir)/"; 
fi; \
  f=$(html__strip_dir) \
  if test -d "$$d$$p"; then \
-- 
2.37.2

[committed] d: Separate documentation indices into options and keywords.

2022-11-30 Thread Iain Buclaw via Gcc-patches

Hi,

This patch separates indexes at the end of the gdc documentation into an
Options index and Keyword index, as per documentation manuals in other
front-ends.

Tested by building and checking pdf/info/man/html pages, and committed
to mainline.

Regards,
Iain.

---
gcc/d/ChangeLog:

* gdc.texi: Separate indices into options and keywords.
---
 gcc/d/gdc.texi | 225 ++---
 1 file changed, 118 insertions(+), 107 deletions(-)

diff --git a/gcc/d/gdc.texi b/gcc/d/gdc.texi
index 6ceb2cc67aa..45dc544e83f 100644
--- a/gcc/d/gdc.texi
+++ b/gcc/d/gdc.texi
@@ -2,6 +2,8 @@
 @setfilename gdc.info
 @settitle The GNU D Compiler
 
+@c Create a separate index for command line options
+@defcodeindex op
 @c Merge the standard indexes into a single one.
 @syncodeindex fn cp
 @syncodeindex vr cp
@@ -69,19 +71,14 @@ language in general, including language specifications and 
standard
 package documentation, see @uref{https://dlang.org/}.
 
 @menu
+* Invoking gdc::How to run gdc.
 * Copying:: The GNU General Public License.
 * GNU Free Documentation License::
 How you can share and copy this manual.
-* Invoking gdc::How to run gdc.
-* Index::   Index.
+* Option Index::Index of command line options.
+* Keyword Index::   Index of concepts.
 @end menu
 
-
-@include gpl_v3.texi
-
-@include fdl.texi
-
-
 @node Invoking gdc
 @chapter Invoking gdc
 
@@ -173,21 +170,21 @@ These options affect the runtime behavior of programs 
compiled with
 @table @gcctabopt
 
 @item -fall-instantiations
-@cindex @option{-fall-instantiations}
-@cindex @option{-fno-all-instantiations}
+@opindex fall-instantiations
+@opindex fno-all-instantiations
 Generate code for all template instantiations.  The default template emission
 strategy is to not generate code for declarations that were either
 instantiated speculatively, such as from @code{__traits(compiles, ...)}, or
 that come from an imported module not being compiled.
 
 @item -fno-assert
-@cindex @option{-fassert}
-@cindex @option{-fno-assert}
+@opindex fassert
+@opindex fno-assert
 Turn off code generation for @code{assert} contracts.
 
 @item -fno-bounds-check
-@cindex @option{-fbounds-check}
-@cindex @option{-fno-bounds-check}
+@opindex fbounds-check
+@opindex fno-bounds-check
 Turns off array bounds checking for all functions, which can improve
 performance for code that uses arrays extensively.  Note that this
 can result in unpredictable behavior if the code in question actually
@@ -195,7 +192,7 @@ does violate array bounds constraints.  It is safe to use 
this option
 if you are sure that your code never throws a @code{RangeError}.
 
 @item -fbounds-check=@var{value}
-@cindex @option{-fbounds-check=}
+@opindex fbounds-check=
 An alternative to @option{-fbounds-check} that allows more control
 as to where bounds checking is turned on or off.  The following values
 are supported:
@@ -210,14 +207,14 @@ Turns off array bounds checking completely.
 @end table
 
 @item -fno-builtin
-@cindex @option{-fbuiltin}
-@cindex @option{-fno-builtin}
+@opindex fbuiltin
+@opindex fno-builtin
 Don't recognize built-in functions unless they begin with the prefix
 @samp{__builtin_}.  By default, the compiler will recognize when a
 function in the @code{core.stdc} package is a built-in function.
 
 @item -fcheckaction=@var{value}
-@cindex @option{-fcheckaction}
+@opindex fcheckaction
 This option controls what code is generated on an assertion, bounds check, or
 final switch failure.  The following values are supported:
 
@@ -232,8 +229,8 @@ Throw an @code{AssertError} (the default).
 
 @item -fdebug
 @item -fdebug=@var{value}
-@cindex @option{-fdebug}
-@cindex @option{-fno-debug}
+@opindex fdebug
+@opindex fno-debug
 Turn on compilation of conditional @code{debug} code into the program.
 The @option{-fdebug} option itself sets the debug level to @code{1},
 while @option{-fdebug=} enables @code{debug} code that are identified
@@ -245,8 +242,8 @@ Turns on compilation of any @code{debug} code identified by 
@var{ident}.
 @end table
 
 @item -fno-druntime
-@cindex @option{-fdruntime}
-@cindex @option{-fno-druntime}
+@opindex fdruntime
+@opindex fno-druntime
 Implements @uref{https://dlang.org/spec/betterc.html}.  Assumes that
 compilation targets an environment without a D runtime library.
 
@@ -257,7 +254,7 @@ gdc -nophoboslib -fno-exceptions -fno-moduleinfo -fno-rtti
 @end example
 
 @item -fextern-std=@var{standard}
-@cindex @option{-fextern-std}
+@opindex fextern-std
 Sets the C++ name mangling compatibility to the version identified by
 @var{standard}.  The following values are supported:
 
@@ -277,20 +274,20 @@ Sets @code{__traits(getTargetInfo, "cppStd")} to 
@code{202002}.
 @end table
 
 @item -fno-invariants
-@cindex @option{-finvariants}
-@cindex @option{-fno-invariants}
+@opindex finvariants
+@opindex fno-invariants
 Turns off

[committed] d: Synchronize gdc documentation with options in d/lang.opt

2022-11-30 Thread Iain Buclaw via Gcc-patches

Hi,

This patch synchronizes the documentation between lang.opt and gdc.texi.

Tested by building and checking pdf/info/man/html pages, and committed
to mainline.

Regards,
Iain.

---
gcc/d/ChangeLog:

* gdc.texi: Update gdc option documentation.
* lang.opt (frevert=intpromote): Correct documentation.
---
 gcc/d/gdc.texi | 38 +-
 gcc/d/lang.opt |  2 +-
 2 files changed, 18 insertions(+), 22 deletions(-)

diff --git a/gcc/d/gdc.texi b/gcc/d/gdc.texi
index d3bf75ccfa9..6ceb2cc67aa 100644
--- a/gcc/d/gdc.texi
+++ b/gcc/d/gdc.texi
@@ -240,9 +240,6 @@ while @option{-fdebug=} enables @code{debug} code that are 
identified
 by any of the following values:
 
 @table @samp
-@item level
-Sets the debug level to @var{level}, any @code{debug} code <= @var{level}
-is compiled into the program.
 @item ident
 Turns on compilation of any @code{debug} code identified by @var{ident}.
 @end table
@@ -325,6 +322,8 @@ values are supported:
 @table @samp
 @item all
 Turns on all upcoming D language features.
+@item bitfields
+Implements bit-fields in D.
 @item dip1000
 Implements 
@uref{https://github.com/dlang/DIPs/blob/master/DIPs/other/DIP1000.md}
 (Scoped pointers).
@@ -353,9 +352,6 @@ rvalues.
 @item inclusiveincontracts
 Implements @code{in} contracts of overridden methods to be a superset of parent
 contract.
-@item intpromote
-Implements C-style integral promotion for unary @code{+}, @code{-} and @code{~}
-expressions.
 @item nosharedaccess
 Turns off and disallows all access to shared memory objects.
 @item rvaluerefparam
@@ -387,13 +383,17 @@ are supported:
 @table @samp
 @item all
 Turns off all revertable D language features.
+@item dip1000
+Reverts @uref{https://github.com/dlang/DIPs/blob/master/DIPs/other/DIP1000.md}
+(Scoped pointers).
 @item dip25
 Reverts @uref{https://github.com/dlang/DIPs/blob/master/DIPs/archive/DIP25.md}
 (Sealed references).
 @item dtorfields
 Turns off generation for destructing fields of partially constructed objects.
-@item markdown
-Turns off Markdown replacements in Ddoc comments.
+@item intpromote
+Turns off C-style integral promotion for unary @code{+}, @code{-} and @code{~}
+expressions.
 @end table
 
 @item -fno-rtti
@@ -423,9 +423,6 @@ Turns on compilation of conditional @code{version} code 
into the program
 identified by any of the following values:
 
 @table @samp
-@item level
-Sets the version level to @var{level}, any @code{version} code >= @var{level}
-is compiled into the program.
 @item ident
 Turns on compilation of @code{version} code identified by @var{ident}.
 @end table
@@ -646,8 +643,10 @@ and provides source for debuggers to show when requested.
 
 @node Warnings
 @section Warnings
-@cindex options to control warnings
-@cindex warning messages
+@cindex options, warnings
+@cindex options, errors
+@cindex warnings, suppressing
+@cindex messages, error
 @cindex messages, warning
 @cindex suppressing warnings
 
@@ -678,6 +677,11 @@ whose bound can be larger than @var{n} bytes.
 @option{-Walloca-larger-than} warning and is equivalent to
 @option{-Walloca-larger-than=@var{SIZE_MAX}} or larger.
 
+@item -Wno-builtin-declaration-mismatch
+@cindex @option{-Wno-builtin-declaration-mismatch}
+@cindex @option{-Wbuiltin-declaration-mismatch}
+Warn if a built-in function is declared with an incompatible signature.
+
 @item -Wcast-result
 @cindex @option{-Wcast-result}
 @cindex @option{-Wno-cast-result}
@@ -704,12 +708,6 @@ List all error messages from speculative compiles, such as
 messages as warnings, and these messages therefore never become
 errors when the @option{-Werror} option is also used.
 
-@item -Wtemplates
-@cindex @option{-Wtemplates}
-@cindex @option{-Wno-templates}
-Warn when a template instantiation is encountered.  Some coding
-rules disallow templates, and this may be used to enforce that rule.
-
 @item -Wunknown-pragmas
 @cindex @option{-Wunknown-pragmas}
 @cindex @option{-Wno-unknown-pragmas}
@@ -764,8 +762,6 @@ List all hidden GC allocations.
 List statistics on template instantiations.
 @item tls
 List all variables going into thread local storage.
-@item vmarkdown
-List instances of Markdown replacements in Ddoc.
 @end table
 
 @end table
diff --git a/gcc/d/lang.opt b/gcc/d/lang.opt
index 15ab725a2dd..b039c766aa9 100644
--- a/gcc/d/lang.opt
+++ b/gcc/d/lang.opt
@@ -422,7 +422,7 @@ Don't destruct fields of partially constructed objects.
 
 frevert=intpromote
 D RejectNegative
-Use C-style integral promotion for unary '+', '-' and '~'.
+Don't use C-style integral promotion for unary '+', '-' and '~'.
 
 frtti
 D
-- 
2.37.2

[GCC-12][committed] d: Fix #error You must define PREFERRED_DEBUGGING_TYPE if DWARF is not supported

2022-11-30 Thread Iain Buclaw via Gcc-patches

Hi,

This patch was applied to mainline back in August, but was held back
from backporting until after the 12.2 release to allow some more time for
testing.  No further regressions have been found, so I have
backported it to the releases/gcc-12 branch.

Bootstrapped and regression tested on x86_64-linux-gnu/-m32/-mx32, and
committed.

Regards,
Iain.

---
PR d/105659

gcc/ChangeLog:

* config.gcc: Set tm_d_file to ${cpu_type}/${cpu_type}-d.h.
* config/aarch64/aarch64-d.cc: Include tm_d.h.
* config/aarch64/aarch64-protos.h (aarch64_d_target_versions): Move to
config/aarch64/aarch64-d.h.
(aarch64_d_register_target_info): Likewise.
* config/aarch64/aarch64.h (TARGET_D_CPU_VERSIONS): Likewise.
(TARGET_D_REGISTER_CPU_TARGET_INFO): Likewise.
* config/arm/arm-d.cc: Include tm_d.h and arm-protos.h instead of
tm_p.h.
* config/arm/arm-protos.h (arm_d_target_versions): Move to
config/arm/arm-d.h.
(arm_d_register_target_info): Likewise.
* config/arm/arm.h (TARGET_D_CPU_VERSIONS): Likewise.
(TARGET_D_REGISTER_CPU_TARGET_INFO): Likewise.
* config/default-d.cc: Remove memmodel.h include.
* config/freebsd-d.cc: Include tm_d.h instead of tm_p.h.
* config/glibc-d.cc: Likewise.
* config/i386/i386-d.cc: Include tm_d.h.
* config/i386/i386-protos.h (ix86_d_target_versions): Move to
config/i386/i386-d.h.
(ix86_d_register_target_info): Likewise.
(ix86_d_has_stdcall_convention): Likewise.
* config/i386/i386.h (TARGET_D_CPU_VERSIONS): Likewise.
(TARGET_D_REGISTER_CPU_TARGET_INFO): Likewise.
(TARGET_D_HAS_STDCALL_CONVENTION): Likewise.
* config/i386/winnt-d.cc: Include tm_d.h instead of tm_p.h.
* config/mips/mips-d.cc: Include tm_d.h.
* config/mips/mips-protos.h (mips_d_target_versions): Move to
config/mips/mips-d.h.
(mips_d_register_target_info): Likewise.
* config/mips/mips.h (TARGET_D_CPU_VERSIONS): Likewise.
(TARGET_D_REGISTER_CPU_TARGET_INFO): Likewise.
* config/netbsd-d.cc: Include tm_d.h instead of tm.h and memmodel.h.
* config/openbsd-d.cc: Likewise.
* config/pa/pa-d.cc: Include tm_d.h.
* config/pa/pa-protos.h (pa_d_target_versions): Move to
config/pa/pa-d.h.
(pa_d_register_target_info): Likewise.
* config/pa/pa.h (TARGET_D_CPU_VERSIONS): Likewise.
(TARGET_D_REGISTER_CPU_TARGET_INFO): Likewise.
* config/riscv/riscv-d.cc: Include tm_d.h.
* config/riscv/riscv-protos.h (riscv_d_target_versions): Move to
config/riscv/riscv-d.h.
(riscv_d_register_target_info): Likewise.
* config/riscv/riscv.h (TARGET_D_CPU_VERSIONS): Likewise.
(TARGET_D_REGISTER_CPU_TARGET_INFO): Likewise.
* config/rs6000/rs6000-d.cc: Include tm_d.h.
* config/rs6000/rs6000-protos.h (rs6000_d_target_versions): Move to
config/rs6000/rs6000-d.h.
(rs6000_d_register_target_info): Likewise.
* config/rs6000/rs6000.h (TARGET_D_CPU_VERSIONS): Likewise.
(TARGET_D_REGISTER_CPU_TARGET_INFO): Likewise.
* config/s390/s390-d.cc: Include tm_d.h.
* config/s390/s390-protos.h (s390_d_target_versions): Move to
config/s390/s390-d.h.
(s390_d_register_target_info): Likewise.
* config/s390/s390.h (TARGET_D_CPU_VERSIONS): Likewise.
(TARGET_D_REGISTER_CPU_TARGET_INFO): Likewise.
* config/sol2-d.cc: Include tm_d.h instead of tm.h and memmodel.h.
* config/sparc/sparc-d.cc: Include tm_d.h.
* config/sparc/sparc-protos.h (sparc_d_target_versions): Move to
config/sparc/sparc-d.h.
(sparc_d_register_target_info): Likewise.
* config/sparc/sparc.h (TARGET_D_CPU_VERSIONS): Likewise.
(TARGET_D_REGISTER_CPU_TARGET_INFO): Likewise.
* configure: Regenerate.
* configure.ac (tm_d_file): Remove defaults.h.
(tm_d_include_list): Remove options.h and insn-constants.h.
* config/aarch64/aarch64-d.h: New file.
* config/arm/arm-d.h: New file.
* config/i386/i386-d.h: New file.
* config/mips/mips-d.h: New file.
* config/pa/pa-d.h: New file.
* config/riscv/riscv-d.h: New file.
* config/rs6000/rs6000-d.h: New file.
* config/s390/s390-d.h: New file.
* config/sparc/sparc-d.h: New file.

(cherry picked from commit d5ad6f8415171798adaff5787400505ce9882144)
---
 gcc/config.gcc  | 10 ++
 gcc/config/aarch64/aarch64-d.cc |  1 +
 gcc/config/aarch64/aarch64-d.h  | 24 
 gcc/config/aarch64/aarch64-protos.h |  4 
 gcc/config/aarch64/aarch64.h|  4 
 gcc/config/arm/arm-d.cc |  3 ++-
 gcc/config/arm/arm-d.h  | 24 
 gcc/config/arm/arm-protos.h |  4 
 gcc/config/arm/arm.h|  4 --

Re: [PATCH v2] libgo: Don't rely on GNU-specific strerror_r variant on Linux

2022-11-30 Thread Ian Lance Taylor via Gcc-patches

On Tue, Nov 29, 2022 at 4:10 PM Ian Lance Taylor  wrote:
>
> On Tue, Nov 29, 2022 at 9:54 AM  wrote:
> >
> > From: Sören Tempel 
> >
> > On glibc, there are two versions of strerror_r: An XSI-compliant and a
> > GNU-specific version. The latter is only available on glibc. In order
> > to avoid duplicating the post-processing code of error messages, this
> > commit provides a separate strerror_go symbol which always refers to the
> > XSI-compliant version of strerror_r (even on glibc) by selectively
> > undefining the corresponding feature test macro.
> >
> > Previously, gofrontend assumed that the GNU-specific version of
> > strerror_r was always available on Linux (which isn't the case when
> > using musl as the libc, for example). This commit thereby improves
> > compatibility with Linux systems that are not using glibc.
> >
> > Tested on x86_64 Alpine Linux Edge and Arch Linux (glibc 2.36).
>
> Thanks.  I committed a version of this, as attached.

I've committed this follow-on patch for Hurd.
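
As a rough illustration of the technique described in the quoted commit
message (a sketch only, not the code in libgo/runtime/go-strerror.c): the
feature test macros are adjusted before any libc header is included, so
that glibc declares the XSI-compliant strerror_r, which returns an int.
The helper name xsi_strerror and its fallback message are hypothetical.

```c
/* Request the XSI-compliant strerror_r.  These lines must come before
   the first libc header is included to have any effect.  */
#undef _GNU_SOURCE
#undef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L

#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write the message for ERRNUM into BUF.
   Returns 0 on success and a nonzero value on failure, in which
   case BUF holds a generic fallback text instead.  */
int
xsi_strerror (int errnum, char *buf, size_t buflen)
{
  /* With the XSI variant the return value is an int status, not a
     pointer to the message as in the GNU-specific variant.  */
  int rc = strerror_r (errnum, buf, buflen);
  if (rc != 0)
    snprintf (buf, buflen, "errno %d", errnum);
  return rc;
}
```

On a libc that only provides the XSI variant (musl, for example) the macro
adjustments should be harmless, which is the point of keeping a single code
path.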

Ian
91607eba8fe49c064192122ec60a3e03dd8f2515
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index 984d8324004..a26f779557d 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-fef6aa3c1678cdbe7dca454b2cebb369d8ba81bf
+1c5bfd57131b68b91d8400bb017f35d416f7aa7b
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/libgo/runtime/go-strerror.c b/libgo/runtime/go-strerror.c
index 13d1d91df84..8ff5ffbdfec 100644
--- a/libgo/runtime/go-strerror.c
+++ b/libgo/runtime/go-strerror.c
@@ -12,7 +12,7 @@
exists to selectively undefine it and provides an alias to the
XSI-compliant version of strerror_r(3).  */
 
-#ifdef __linux__
+#if defined(__linux__) || defined(__gnu_hurd__)
 
 /* Force selection of XSI-compliant strerror_r by glibc.  */
 #undef XOPEN_SOURCE
@@ -21,7 +21,7 @@
 #define _POSIX_C_SOURCE 200112L
 #undef _GNU_SOURCE
 
-#endif /* __linux__ */
+#endif /* defined(__linux__) || defined(__gnu_hurd__) */
 
 #include

Re: [PATCH] libgccjit: Fix float vector comparison

2022-11-30 Thread Antoni Boucher via Gcc-patches

David: PING

On Sun, 2022-11-20 at 14:03 -0500, Antoni Boucher wrote:
> Hi.
> This fixes bug 107770.
> Thanks for the review.

Re: [PATCH] libstdc++: Add error handler for

2022-11-30 Thread Björn Schäpers

One could (for a manual test) always change libbacktrace to call the callback.
Or invoke it on a platform where libbacktrace can't figure out the executable
path on its own, as is currently the case on Windows.


As for an automated test, I have no idea how to enforce that, without changing 
the code to be tested.


Björn.

On 30.11.2022 at 07:04, François Dumont wrote:

Good catch, then we also need this patch.

I still need to test it though, just to make sure it compiles. Unless you have
a nice way to force a call to the error callback?


François

On 29/11/22 22:41, Björn Schäpers wrote:

From: Björn Schäpers 

Not providing an error handler results in a null pointer dereference when
an error occurs.
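
A minimal standalone sketch of the failure mode and of the fix, assuming a
library routine that reports errors through a callback without checking it
for NULL.  The callback type has the same shape as libbacktrace's
backtrace_error_callback; failing_operation, ignore_error and
run_with_noop_handler are hypothetical names used only for illustration.

```c
#include <stddef.h>

/* Same shape as libbacktrace's backtrace_error_callback.  */
typedef void (*error_callback) (void *data, const char *msg, int errnum);

/* Hypothetical stand-in for a library routine that hits an error and
   reports it through the callback without a NULL check.  Passing NULL
   for ON_ERROR would dereference a null function pointer here.  */
static int
failing_operation (error_callback on_error, void *data)
{
  on_error (data, "could not open executable", -1);
  return 0;
}

/* No-op handler: it discards the report, but it gives the library a
   valid function to call.  */
static void
ignore_error (void *data, const char *msg, int errnum)
{
  (void) data;
  (void) msg;
  (void) errnum;
}

/* Calls the routine with the no-op handler instead of NULL.  */
int
run_with_noop_handler (void)
{
  return failing_operation (ignore_error, NULL);
}
```

The patch takes the same approach with an empty inline
__backtrace_error_handler passed wherever nullptr was passed before.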

libstdc++-v3/ChangeLog

* include/std/stacktrace: Add __backtrace_error_handler and use
it in all calls to libbacktrace.
---
  libstdc++-v3/include/std/stacktrace | 21 ++---
  1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/libstdc++-v3/include/std/stacktrace 
b/libstdc++-v3/include/std/stacktrace

index e7cbbee5638..b786441cbad 100644
--- a/libstdc++-v3/include/std/stacktrace
+++ b/libstdc++-v3/include/std/stacktrace
@@ -85,6 +85,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  #define __cpp_lib_stacktrace 202011L
+  inline void
+  __backtrace_error_handler(void*, const char*, int) {}
+
    // [stacktrace.entry], class stacktrace_entry
    class stacktrace_entry
    {
@@ -159,7 +162,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  _S_init()
  {
    static __glibcxx_backtrace_state* __state
-    = __glibcxx_backtrace_create_state(nullptr, 1, nullptr, nullptr);
+    = __glibcxx_backtrace_create_state(nullptr, 1,
+   __backtrace_error_handler, nullptr);
    return __state;
  }
@@ -192,7 +196,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
    return __function != nullptr;
    };
    const auto __state = _S_init();
-  if (::__glibcxx_backtrace_pcinfo(__state, _M_pc, +__cb, nullptr, 
&__data))
+  if (::__glibcxx_backtrace_pcinfo(__state, _M_pc, +__cb,
+   __backtrace_error_handler, &__data))
  return true;
    if (__desc && __desc->empty())
  {
@@ -201,8 +206,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
    if (__symname)
  *static_cast<_Data*>(__data)->_M_desc = _S_demangle(__symname);
    };
-  if (::__glibcxx_backtrace_syminfo(__state, _M_pc, +__cb2, nullptr,
-    &__data))
+  if (::__glibcxx_backtrace_syminfo(__state, _M_pc, +__cb2,
+    __backtrace_error_handler, &__data))
  return true;
  }
    return false;
@@ -252,7 +257,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  if (auto __cb = __ret._M_prepare()) [[likely]]
    {
  auto __state = stacktrace_entry::_S_init();
-    if (__glibcxx_backtrace_simple(__state, 1, __cb, nullptr,
+    if (__glibcxx_backtrace_simple(__state, 1, __cb,
+   __backtrace_error_handler,
 std::__addressof(__ret)))
    __ret._M_clear();
    }
@@ -270,7 +276,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  if (auto __cb = __ret._M_prepare()) [[likely]]
    {
  auto __state = stacktrace_entry::_S_init();
-    if (__glibcxx_backtrace_simple(__state, __skip + 1, __cb, nullptr,
+    if (__glibcxx_backtrace_simple(__state, __skip + 1, __cb,
+   __backtrace_error_handler,
 std::__addressof(__ret)))
    __ret._M_clear();
    }
@@ -294,7 +301,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
    {
  auto __state = stacktrace_entry::_S_init();
  int __err = __glibcxx_backtrace_simple(__state, __skip + 1, __cb,
-   nullptr,
+   __backtrace_error_handler,
 std::__addressof(__ret));
  if (__err < 0)
    __ret._M_clear();

Re: [PATCH] coroutines: Fix promotion of class members in co_await statements [PR99576]

2022-11-30 Thread Jason Merrill via Gcc-patches


On 11/30/22 03:51, Iain Sandoe wrote:

Hi Adrian,


On 28 Nov 2022, at 20:44, Iain Sandoe  wrote:



Bootstrapping and running the testsuite on x86_64 was successful. No
regressions occurred.


This looks reasonable to me, as said in the PR.  I’d like to test a little more
widely with some larger
codebases, if you could bear with me for a few days.


So wider testing (in this case folly) reveals that, although the analysis seems 
reasonable, this is not quite the right patch to fix the issue.  It can be that 
CONSTRUCTORS contain nested await expressions, so we cannot simply punt on 
seeing one.

My hunch is that the real solution lies in (correctly) deciding whether to 
promote the temporary or not.  Jason recently made a change that identifies 
whether a target expression is expected to be elided (i.e. it is a direct 
intializer for another object) - I think this might help in this case.  My 
concern is whether I should read “expected to be elided” to be a guarantee 
(“expected” to me could also be read “but it might not be”).


You should be able to rely on that flag.  I believe all TARGET_EXPRs 
with TARGET_EXPR_ELIDING_P set are indeed elided.  As an optimization, 
occasionally TARGET_EXPRs without the flag are elided anyway, but it's 
still safe to promote them; the optimization just won't happen then.


Jason

Re: [PATCH] c++: Incremental fix for g++.dg/gomp/for-21.C [PR84469]

2022-11-30 Thread Jason Merrill via Gcc-patches


On 11/30/22 10:51, Jakub Jelinek wrote:

On Tue, Nov 29, 2022 at 11:05:33PM +0100, Jakub Jelinek wrote:

On Tue, Nov 29, 2022 at 04:38:50PM -0500, Jason Merrill wrote:

--- gcc/testsuite/g++.dg/gomp/for-21.C.jj   2020-01-12 11:54:37.178401867 
+0100
+++ gcc/testsuite/g++.dg/gomp/for-21.C  2022-11-29 13:06:59.038410557 +0100
@@ -54,9 +54,9 @@ void
   f6 (S (&a)[10])
   {
 #pragma omp for collapse (2)
-  for (auto [i, j, k] : a) // { dg-error "use of 'i' before deduction of 
'auto'" "" { target *-*-* } .-1 }
+  for (auto [i, j, k] : a) // { dg-error "use of 'i' before 
deduction of 'auto'" }
   for (int l = i; l < j; l += k)// { dg-error "use of 'j' 
before deduction of 'auto'" }
-  ;// { dg-error "use of 'k' before 
deduction of 'auto'" "" { target *-*-* } .-3 }
+  ;// { dg-error "use of 'k' before 
deduction of 'auto'" "" { target *-*-* } .-1 }


Hmm, this error is surprising: since the initializer is non-dependent, we
should have deduced immediately.  I'd expect the same error as in the
non-structured-binding cases, "* expression refers to iteration variable".


The reason was just to be consistent with what is (unfortunately) emitted
in the other cases (!processing_template_decl or type dependent).
I guess I could try to see how much work it would be to deduce it sooner, but
generally it is a pretty rare corner case; people rarely do this in OpenMP code.


I had a look at that today, but it would be pretty hard.  The thing is
we must emit all the associated code for all the range for loops in
OpenMP loops at a different spot.  So, the only possibility I see would
be if we during parsing of a range for loop inside of the OpenMP loop nest
we don't do the cp_finish_omp_range_for stuff to avoid e.g. cp_finish_decl,
but instead
   build_x_indirect_ref (input_location, begin, RO_UNARY_STAR,
NULL_TREE, tf_none)
and if that gives a non-dependent type, temporarily overwrite TREE_TYPE
of the decl and if it is structured binding, temporarily
++processing_template_decl and cp_finish_decomp, then after parsing all the
associated loop headers we revert that (and ditto for instantiation of
OpenMP loops).


It looks like we're already deducing the type for the underlying S 
variable in cp_convert_omp_range_for, we just aren't updating the types 
of the individual bindings.


Jason

Re: [PATCH][OG12] amdgcn: Support AMD-specific 'isa' and 'arch' traits in OpenMP context selectors

2022-11-30 Thread Kwok Cheung Yeung via Gcc-patches


Hello PA,


--- libgomp/config/gcn/selector.c
+++ libgomp/config/gcn/selector.c
@@ -36,7 +36,7 @@ GOMP_evaluate_current_device (const char *kind, const char 
*arch,
   if (kind && strcmp (kind, "gpu") != 0)
 return false;
 
-  if (arch && strcmp (arch, "gcn") != 0)

+  if (arch && (strcmp (arch, "gcn") != 0 || strcmp (arch, "amdgcn") != 0))
 return false;


The logic here looks wrong to me - surely it should return false if arch 
is not 'gcn' AND it is not 'amdgcn'?
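
A minimal sketch of the difference, assuming a standalone helper rather than the real GOMP_evaluate_current_device (the names arch_matches and arch_matches_as_posted are hypothetical). With `||`, at least one of the two comparisons is always non-zero, so every non-NULL arch is rejected; with `&&`, only strings that match neither spelling are rejected:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of libgomp: returns true when ARCH is
   unset or names the GCN architecture under either spelling.  */
static bool
arch_matches (const char *arch)
{
  /* Reject only when ARCH is neither "gcn" nor "amdgcn".  */
  if (arch && strcmp (arch, "gcn") != 0 && strcmp (arch, "amdgcn") != 0)
    return false;
  return true;
}

/* The condition as posted in the patch: a string cannot equal both
   "gcn" and "amdgcn", so one comparison is always non-zero and every
   non-NULL ARCH is rejected.  */
static bool
arch_matches_as_posted (const char *arch)
{
  if (arch && (strcmp (arch, "gcn") != 0 || strcmp (arch, "amdgcn") != 0))
    return false;
  return true;
}
```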



@@ -48,8 +48,17 @@ GOMP_evaluate_current_device (const char *kind, const char 
*arch,
 #endif
 
 #ifdef __GCN5__

-  if (strcmp (isa, "gfx900") == 0 || strcmp (isa, "gfx906") != 0
-  || strcmp (isa, "gfx908") == 0)
+  if (strcmp (isa, "gfx900") == 0 || strcmp (isa, "gfx906") != 0)
+return true;
+#endif
+
+#ifdef __CDNA1__
+  if (strcmp (isa, "gfx908") == 0)
+return true;
+#endif
+
+#ifdef __CDNA2__
+  if (strcmp (isa, "gfx90a") == 0)
 return true;
 #endif


Okay for gfx908 and gfx90a, but is there any way of distinguishing 
between 'gfx900' and 'gfx906' ISAs? I don't think these are mutually 
compatible.


Thanks

Kwok

[committed] hppa: Fix addvdi3 and subvdi3 patterns

2022-11-30 Thread John David Anglin

This was found while building the 64-bit openssh package.

Committed to active branches.

Dave
---

Fix addvdi3 and subvdi3 patterns

While most PA 2.0 instructions support both 32 and 64-bit traps
and conditions, the addi and subi instructions only support 32-bit
traps and conditions. Thus, we need to force immediate operands
to register operands on the 64-bit target and use the add/sub
instructions which can trap on 64-bit signed overflow.
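
For illustration only, the condition these patterns trap on can be sketched in C. This assumes the GCC/clang builtins __builtin_add_overflow and __builtin_sub_overflow; the helper names are hypothetical and nothing here is taken from pa.md. A sum such as INT32_MAX + 1 overflows 32 bits but not 64, which is why a 32-bit trap condition cannot stand in for the 64-bit one:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when A + B does not fit in a signed
   64-bit integer.  __builtin_add_overflow is a GCC/clang builtin.  */
static bool
add_overflows_i64 (int64_t a, int64_t b)
{
  int64_t sum;
  return __builtin_add_overflow (a, b, &sum);
}

/* Hypothetical helper: true when A - B does not fit in a signed
   64-bit integer.  */
static bool
sub_overflows_i64 (int64_t a, int64_t b)
{
  int64_t diff;
  return __builtin_sub_overflow (a, b, &diff);
}
```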

2022-11-30  John David Anglin  

gcc/ChangeLog:

* config/pa/pa.md (addvdi3): Force operand 2 to a register.
Remove "addi,tsv,*" instruction from unnamed pattern.
(subvdi3): Force operand 1 to a register.
Remove "subi,tsv" instruction from unnamed pattern.

diff --git a/gcc/config/pa/pa.md b/gcc/config/pa/pa.md
index 76ae35d4cfa..41382271e54 100644
--- a/gcc/config/pa/pa.md
+++ b/gcc/config/pa/pa.md
@@ -5071,23 +5071,25 @@
(match_dup 2
   (const_int 0))])]
   ""
-  "")
+  "
+{
+  if (TARGET_64BIT)
+operands[2] = force_reg (DImode, operands[2]);
+}")
 
 (define_insn ""
-  [(set (match_operand:DI 0 "register_operand" "=r,r")
-   (plus:DI (match_operand:DI 1 "reg_or_0_operand" "%rM,rM")
-(match_operand:DI 2 "arith11_operand" "r,I")))
+  [(set (match_operand:DI 0 "register_operand" "=r")
+   (plus:DI (match_operand:DI 1 "reg_or_0_operand" "%rM")
+(match_operand:DI 2 "register_operand" "r")))
(trap_if (ne (plus:TI (sign_extend:TI (match_dup 1))
 (sign_extend:TI (match_dup 2)))
(sign_extend:TI (plus:DI (match_dup 1)
 (match_dup 2
(const_int 0))]
   "TARGET_64BIT"
-  "@
-  add,tsv,* %2,%1,%0
-  addi,tsv,* %2,%1,%0"
-  [(set_attr "type" "binary,binary")
-   (set_attr "length" "4,4")])
+  "add,tsv,* %2,%1,%0"
+  [(set_attr "type" "binary")
+   (set_attr "length" "4")])
 
 (define_insn ""
   [(set (match_operand:DI 0 "register_operand" "=r")
@@ -5262,23 +5264,25 @@
 (match_dup 2
   (const_int 0))])]
   ""
-  "")
+  "
+{
+  if (TARGET_64BIT)
+operands[1] = force_reg (DImode, operands[1]);
+}")
 
 (define_insn ""
-  [(set (match_operand:DI 0 "register_operand" "=r,r")
-   (minus:DI (match_operand:DI 1 "arith11_operand" "r,I")
- (match_operand:DI 2 "reg_or_0_operand" "rM,rM")))
+  [(set (match_operand:DI 0 "register_operand" "=r")
+   (minus:DI (match_operand:DI 1 "register_operand" "r")
+ (match_operand:DI 2 "reg_or_0_operand" "rM")))
(trap_if (ne (minus:TI (sign_extend:TI (match_dup 1))
  (sign_extend:TI (match_dup 2)))
(sign_extend:TI (minus:DI (match_dup 1)
  (match_dup 2
(const_int 0))]
   "TARGET_64BIT"
-  "@
-  {subo|sub,tsv} %1,%2,%0
-  {subio|subi,tsv} %1,%2,%0"
-  [(set_attr "type" "binary,binary")
-   (set_attr "length" "4,4")])
+  "sub,tsv,* %1,%2,%0"
+  [(set_attr "type" "binary")
+   (set_attr "length" "4")])
 
 (define_insn ""
   [(set (match_operand:DI 0 "register_operand" "=r,&r")



Re: Java front-end and library patches.

2022-11-30 Thread Joseph Myers

On Wed, 30 Nov 2022, Zopolis0 via Gcc-patches wrote:

> > * Each patch should have its own explanation of what it is doing and why,
> > in the message body (not in an attachment).  Just the commit summary line
> > and ChangeLog entries aren't enough, we need the actual substantive commit
> > message explaining the patch.
> 
> The thing is, most of the patches do not need an explanation. Patches
> 1-13 are just re-adding code,

Then state that in the message body (with a reference to the commit that 
removed the code).

Just because code was removed in a given form doesn't mean it should be 
added back in that form.  For example, patch 13, "Re-add 
flag_evaluation_order, reorder_operands_p, and add reorder bool argument 
to tree_swap_operands_p", seems suspicious.  That sort of global state 
affecting IR semantics is best avoided; rather, the Java gimplification 
support should deal with ensuring the correct ordering for operations in 
the GIMPLE generated.  Note that C++ flag_strong_eval_order (for C++17 
evaluation order requirements) is specific to the front end; it doesn't 
require anything in expr.cc or fold-const.cc or other language-independent 
files.  So you should do something similar for Java rather than adding 
back global language-independent state for this.
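
As a rough illustration of making operand order explicit in the generated code instead of relying on a global flag, here is a C sketch. It is only an analogy: the function names and the trace buffer are hypothetical, and a front end would do the equivalent when building GIMPLE rather than in source form. In C the order of the two calls in a binary expression is unspecified, while evaluating each operand into a temporary fixes it:

```c
#include <stddef.h>

/* Hypothetical trace buffer recording the order in which operands run.  */
static char trace[8];
static size_t trace_len;

/* Record TAG in the trace and return VALUE.  */
static int
operand (char tag, int value)
{
  if (trace_len < sizeof trace - 1)
    trace[trace_len++] = tag;
  trace[trace_len] = '\0';
  return value;
}

/* The order of the two calls is unspecified in C.  */
static int
sub_unordered (void)
{
  return operand ('L', 10) - operand ('R', 3);
}

/* Left-to-right order made explicit with temporaries, so no global
   "evaluation order" state is needed to get the right sequence.  */
static int
sub_left_to_right (void)
{
  int lhs = operand ('L', 10);
  int rhs = operand ('R', 3);
  return lhs - rhs;
}
```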

Patches 1 and 2 don't seem to have reached the mailing list.

> 20-43 and 47 are just applying treewide
> changes that Java missed out on,

So say for each one exactly which commit it's applying the changes for.

> > How has the series been validated?
> 
> I'm not exactly sure what you mean by this.

What target triplets did you run the GCC testsuite on (before and after 
the changes, with no regressions), with what results for the Java-specific 
tests?

> I plan to
> replace Classpath with the OpenJDK, and double down on the machine
> code aspect of GCJ, dropping bytecode and interpreted support.

This sort of thing is key information to include in the summary message 
for any future versions of the patch series.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH] libstdc++: Add error handler for

2022-11-30 Thread François Dumont via Gcc-patches


On 30/11/22 14:07, Jonathan Wakely wrote:

On Wed, 30 Nov 2022 at 11:57, Jonathan Wakely  wrote:



On Wed, 30 Nov 2022 at 11:54, Jonathan Wakely  wrote:



On Wed, 30 Nov 2022 at 06:04, François Dumont via Libstdc++ 
 wrote:

Good catch, then we also need this patch.


Is it worth printing an error? If we can't show the backtrace because of an 
error, we can just print nothing there.

We also need to pass an error handler to the __glibcxx_backtrace_create_state 
call in formatter.h.

Now that I look at this code again, why do we need the _M_backtrace_full 
member? It's always set to the same thing, why can't we just call that function 
directly?


Oh right, I remember now ... because otherwise the libstdc++.so library needs 
the definition of __glibcxx_backtrace_full.

I'm testing the attached patch.



And I think we should use threaded=1 for the __glibcxx_backtrace_create_state 
call.


You mean that 2 threads could try to assert at the same time?

I don't know what the rule is for the static _Error_formatter instance in 
_S_at. If we have a strong guarantee that only 1 instance will be created 
then I understand why we need threaded=1, even if in this case the 2 
threads will report the same stacktrace.

[PATCH] longlong.h: Do no use asm input cast for clang

2022-11-30 Thread Adhemerval Zanella via Gcc-patches

clang by default rejects the input casts with:

  error: invalid use of a cast in a inline asm context requiring an
  lvalue: remove the cast or build with -fheinous-gnu-extensions

And even with -fheinous-gnu-extensions clang still emits a warning
and also states that this option might be removed in the future.
For gcc the casts are still somewhat useful [1], so just remove them
when clang is used.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2021-October/581722.html
---
 include/ChangeLog  |  60 ++
 include/longlong.h | 524 +++--
 2 files changed, 325 insertions(+), 259 deletions(-)

diff --git a/include/ChangeLog b/include/ChangeLog
index dda005335c0..747fc923ef5 100644
--- a/include/ChangeLog
+++ b/include/ChangeLog
@@ -1,3 +1,63 @@
+2022-11-30  Adhemerval Zanella  
+
+   * include/longlong.h: Modified.
+   [(__GNUC__) && ! NO_ASM][( (__i386__) ||  (__i486__)) && W_TYPE_SIZE == 
32](add_ss): Modified.
+   [(__GNUC__) && ! NO_ASM][( (__i386__) ||  (__i486__)) && W_TYPE_SIZE == 
32](sub_ddmmss): Modified.
+   [(__GNUC__) && ! NO_ASM][( (__i386__) ||  (__i486__)) && W_TYPE_SIZE == 
32](umul_ppmm): Modified.
+   [(__GNUC__) && ! NO_ASM][( (__i386__) ||  (__i486__)) && W_TYPE_SIZE == 
32](udiv_qrnnd): Modified.
+   [(__GNUC__) && ! NO_ASM][(( (__sparc__) &&  (__arch64__)) ||  
(__sparcv9))  && W_TYPE_SIZE == 64](add_ss): Modified.
+   [(__GNUC__) && ! NO_ASM][(( (__sparc__) &&  (__arch64__)) ||  
(__sparcv9))  && W_TYPE_SIZE == 64](sub_ddmmss): Modified.
+   [(__GNUC__) && ! NO_ASM][(( (__sparc__) &&  (__arch64__)) ||  
(__sparcv9))  && W_TYPE_SIZE == 64](umul_ppmm): Modified.
+   [(__GNUC__) && ! NO_ASM][(__M32R__) && W_TYPE_SIZE == 32](add_ss): 
Modified.
+   [(__GNUC__) && ! NO_ASM][(__M32R__) && W_TYPE_SIZE == 32](sub_ddmmss): 
Modified.
+   [(__GNUC__) && ! NO_ASM][(__arc__) && W_TYPE_SIZE == 32](add_ss): 
Modified.
+   [(__GNUC__) && ! NO_ASM][(__arc__) && W_TYPE_SIZE == 32](sub_ddmmss): 
Modified.
+   [(__GNUC__) && ! NO_ASM][(__arm__) && ( (__thumb2__) || ! __thumb__)  
&& W_TYPE_SIZE == 32][(__ARM_ARCH_2__) || (__ARM_ARCH_2A__)  || 
(__ARM_ARCH_3__)](umul_ppmm): Modified.
+   [(__GNUC__) && ! NO_ASM][(__arm__) && ( (__thumb2__) || ! __thumb__)  
&& W_TYPE_SIZE == 32](add_ss): Modified.
+   [(__GNUC__) && ! NO_ASM][(__arm__) && ( (__thumb2__) || ! __thumb__)  
&& W_TYPE_SIZE == 32](sub_ddmmss): Modified.
+   [(__GNUC__) && ! NO_ASM][(__hppa) && W_TYPE_SIZE == 32](add_ss): 
Modified.
+   [(__GNUC__) && ! NO_ASM][(__hppa) && W_TYPE_SIZE == 32](sub_ddmmss): 
Modified.
+   [(__GNUC__) && ! NO_ASM][(__i960__) && W_TYPE_SIZE == 32](umul_ppmm): 
Modified.
+   [(__GNUC__) && ! NO_ASM][(__i960__) && W_TYPE_SIZE == 32](__umulsidi3): 
Modified.
+   [(__GNUC__) && ! NO_ASM][(__ibm032__)  && W_TYPE_SIZE == 
32](add_ss): Modified.
+   [(__GNUC__) && ! NO_ASM][(__ibm032__)  && W_TYPE_SIZE == 
32](sub_ddmmss): Modified.
+   [(__GNUC__) && ! NO_ASM][(__ibm032__)  && W_TYPE_SIZE == 
32](umul_ppmm): Modified.
+   [(__GNUC__) && ! NO_ASM][(__ibm032__)  && W_TYPE_SIZE == 
32](count_leading_zeros): Modified.
+   [(__GNUC__) && ! NO_ASM][(__m88000__) && W_TYPE_SIZE == 
32][(__mc88110__)](umul_ppmm): Modified.
+   [(__GNUC__) && ! NO_ASM][(__m88000__) && W_TYPE_SIZE == 
32][(__mc88110__)](udiv_qrnnd): Modified.
+   [(__GNUC__) && ! NO_ASM][(__m88000__) && W_TYPE_SIZE == 
32](add_ss): Modified.
+   [(__GNUC__) && ! NO_ASM][(__m88000__) && W_TYPE_SIZE == 
32](sub_ddmmss): Modified.
+   [(__GNUC__) && ! NO_ASM][(__m88000__) && W_TYPE_SIZE == 
32](count_leading_zeros): Modified.
+   [(__GNUC__) && ! NO_ASM][(__mc68000__) && W_TYPE_SIZE == 
32][!((__mcoldfire__))](umul_ppmm): Modified.
+   [(__GNUC__) && ! NO_ASM][(__mc68000__) && W_TYPE_SIZE == 32][( 
(__mc68020__) && ! __mc68060__)](umul_ppmm): Modified.
+   [(__GNUC__) && ! NO_ASM][(__mc68000__) && W_TYPE_SIZE == 32][( 
(__mc68020__) && ! __mc68060__)](udiv_qrnnd): Modified.
+   [(__GNUC__) && ! NO_ASM][(__mc68000__) && W_TYPE_SIZE == 32][( 
(__mc68020__) && ! __mc68060__)](sdiv_qrnnd): Modified.
+   [(__GNUC__) && ! NO_ASM][(__mc68000__) && W_TYPE_SIZE == 
32][(__mcoldfire__)](umul_ppmm): Modified.
+   [(__GNUC__) && ! NO_ASM][(__mc68000__) && W_TYPE_SIZE == 
32](add_ss): Modified.
+   [(__GNUC__) && ! NO_ASM][(__mc68000__) && W_TYPE_SIZE == 
32](sub_ddmmss): Modified.
+   [(__GNUC__) && ! NO_ASM][(__sh__) && W_TYPE_SIZE == 32][! 
__sh1__](umul_ppmm): Modified.
+   [(__GNUC__) && ! NO_ASM][(__sparc__) && ! __arch64__ && ! __sparcv9  && 
W_TYPE_SIZE == 
32][!((__sparc_v9__))][!((__sparc_v8__))][!((__sparclite__))](umul_ppmm): 
Modified.
+   [(__GNUC__) && ! NO_ASM][(__sparc__) && ! __arch64__ && ! __sparcv9  && 
W_TYPE_SIZE == 
32][!((__sparc_v9__))][!((__sparc_v8__))][!((__sparclite__))](udiv_qrnnd): 
Modified.
+   [(__GNUC__) && ! NO_ASM][(__spa

Re: [PATCH] libstdc++: Add error handler for

2022-11-30 Thread François Dumont via Gcc-patches


On 30/11/22 14:07, Jonathan Wakely wrote:

On Wed, 30 Nov 2022 at 11:57, Jonathan Wakely  wrote:



On Wed, 30 Nov 2022 at 11:54, Jonathan Wakely  wrote:



On Wed, 30 Nov 2022 at 06:04, François Dumont via Libstdc++ 
 wrote:

Good catch, then we also need this patch.


Is it worth printing an error? If we can't show the backtrace because of an 
error, we can just print nothing there.


No strong opinion on that but if we do not print anything the output 
will be:


Backtrace:

Error: ...

I just considered that it did not cost much to report the issue to the 
user that defined _GLIBCXX_DEBUG_BACKTRACE and so is expecting a backtrace.


Maybe printing "Backtrace:\n" could be done in the normal callback 
leaving the user with the feeling that _GLIBCXX_DEBUG_BACKTRACE does not 
work.




We also need to pass an error handler to the __glibcxx_backtrace_create_state 
call in formatter.h.

Now that I look at this code again, why do we need the _M_backtrace_full 
member? It's always set to the same thing, why can't we just call that function 
directly?


Oh right, I remember now ... because otherwise the libstdc++.so library needs 
the definition of __glibcxx_backtrace_full.

I'm testing the attached patch.



And I think we should use threaded=1 for the __glibcxx_backtrace_create_state 
call.

So like the attached patch.

[PATCH] aarch64: Specify that FEAT_MOPS sequences clobber CC

2022-11-30 Thread Kyrylo Tkachov via Gcc-patches

Hi all,

According to the architecture pseudocode the FEAT_MOPS sequences overwrite the 
NZCV flags
as part of their operation, so GCC needs to model that in the relevant RTL 
patterns.
For the testcase:
void g();
void foo (int a, size_t N, char *__restrict__ in,
          char *__restrict__ out)
{
  if (a != 3)
    __builtin_memcpy (out, in, N);
  if (a > 3)
    g ();
}

we will currently generate:
foo:
        cmp     w0, 3
        bne     .L6
.L1:
        ret
.L6:
        cpyfp   [x3]!, [x2]!, x1!
        cpyfm   [x3]!, [x2]!, x1!
        cpyfe   [x3]!, [x2]!, x1!
        ble     .L1 // Flags reused after CPYF* sequence
        b       g

This is wrong as the result of cmp needs to be recalculated after the MOPS 
sequence.
With this patch we'll insert a "cmp w0, 3" before the ble, like what clang does.

Bootstrapped and tested on aarch64-none-linux-gnu.
Pushing to trunk and to the GCC 12 branch after some baking time.

Thanks,
Kyrill

gcc/ChangeLog:

* config/aarch64/aarch64.md (aarch64_cpymemdi): Specify clobber of CC 
reg.
(*aarch64_cpymemdi): Likewise.
(aarch64_movmemdi): Likewise.
(aarch64_setmemdi): Likewise.
(*aarch64_setmemdi): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/mops_5.c: New test.
* gcc.target/aarch64/mops_6.c: Likewise.
* gcc.target/aarch64/mops_7.c: Likewise.


mops-cc.patch
Description: mops-cc.patch

Re: [PATCH 1/2] Fix C/107926: Wrong error message when initializing char array

2022-11-30 Thread Jakub Jelinek via Gcc-patches

On Wed, Nov 30, 2022 at 09:18:14AM -0800, apinski--- via Gcc-patches wrote:
> gcc/c/ChangeLog:
> 
>   PR c/107926
>   * c-typeck.cc (process_init_element):
>   Move the ceck for string cst until
>   after the error message.

Just a ChangeLog nit, not a patch review for which I defer to C FE
maintainers/reviewers.
s/ceck/check/, plus don't start the description uselessly on a next line
when half of it would fit on the first line after ):.
* c-typeck.cc (process_init_element): Move the check for string cst
until after the error message.

Jakub

[PATCH 2/2] Improve error message for excess elements in array initializer from {"a"}

2022-11-30 Thread apinski--- via Gcc-patches

From: Andrew Pinski 

So char arrays are not the only type that can be initialized from {"a"}.
We can have wchar_t (L"") and char16_t (u"") types too. So let's
print out the type of the array instead of just saying char.
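
A small C11 sketch of the three cases mentioned above, each array initialized from a single braced string literal. The variable names and the ARRAY_LEN macro are only illustrative; char16_t comes from <uchar.h>:

```c
#include <stddef.h>
#include <uchar.h>
#include <wchar.h>

/* Each array is initialized from one string literal enclosed in braces.  */
static const char narrow[] = { "a" };
static const wchar_t wide[] = { L"a" };
static const char16_t utf16[] = { u"a" };

/* Number of elements, including the terminating null character.  */
#define ARRAY_LEN(a) (sizeof (a) / sizeof ((a)[0]))
```

Adding a second element to any of these initializers is what triggers the "excess elements" diagnostic discussed here.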

Note in the testsuite I used regex . to match '[' and ']' as
I could not figure out how many '\' I needed.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/c/ChangeLog:

* c-typeck.cc (process_init_element): Print out array type
for excessive elements.

gcc/testsuite/ChangeLog:

* gcc.dg/init-bad-1.c: Update error message.
* gcc.dg/init-bad-2.c: Likewise.
* gcc.dg/init-bad-3.c: Likewise.
* gcc.dg/init-excess-3.c: Likewise.
* gcc.dg/pr61096-1.c: Likewise.
---
 gcc/c/c-typeck.cc|  2 +-
 gcc/testsuite/gcc.dg/init-bad-1.c|  2 +-
 gcc/testsuite/gcc.dg/init-bad-2.c|  2 +-
 gcc/testsuite/gcc.dg/init-bad-3.c|  2 +-
 gcc/testsuite/gcc.dg/init-excess-3.c | 12 ++--
 gcc/testsuite/gcc.dg/pr61096-1.c |  2 +-
 6 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index 0fc382c..f1a1752 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -10631,7 +10631,7 @@ process_init_element (location_t loc, struct c_expr 
value, bool implicit,
 {
   if (constructor_stack->replacement_value.value)
{
- error_init (loc, "excess elements in %<char%> array initializer");
+ error_init (loc, "excess elements in %qT initializer", 
constructor_type);
  return;
}
   else if (string_flag)
diff --git a/gcc/testsuite/gcc.dg/init-bad-1.c 
b/gcc/testsuite/gcc.dg/init-bad-1.c
index 0da10c3..7c80006 100644
--- a/gcc/testsuite/gcc.dg/init-bad-1.c
+++ b/gcc/testsuite/gcc.dg/init-bad-1.c
@@ -18,7 +18,7 @@ char s[1] = "x";
 char s1[1] = { "x" };
 char t[1] = "xy"; /* { dg-warning "initializer-string for array of 'char' is 
too long" } */
 char t1[1] = { "xy" }; /* { dg-warning "initializer-string for array of 'char' 
is too long" } */
-char u[1] = { "x", "x" }; /* { dg-error "excess elements in 'char' array 
initializer" } */
+char u[1] = { "x", "x" }; /* { dg-error "excess elements in 'char.1.' 
initializer" } */
 /* { dg-message "near init" "near" { target *-*-* } .-1 } */
 
 int i = { };
diff --git a/gcc/testsuite/gcc.dg/init-bad-2.c 
b/gcc/testsuite/gcc.dg/init-bad-2.c
index 4775c48..57fd9f9 100644
--- a/gcc/testsuite/gcc.dg/init-bad-2.c
+++ b/gcc/testsuite/gcc.dg/init-bad-2.c
@@ -19,7 +19,7 @@ char s[1] = "x";
 char s1[1] = { "x" };
 char t[1] = "xy"; /* { dg-warning "initializer-string for array of 'char' is 
too long" } */
 char t1[1] = { "xy" }; /* { dg-warning "initializer-string for array of 'char' 
is too long" } */
-char u[1] = { "x", "x" }; /* { dg-error "excess elements in 'char' array 
initializer" } */
+char u[1] = { "x", "x" }; /* { dg-error "excess elements in 'char.1.' 
initializer" } */
 /* { dg-message "near init" "near" { target *-*-* } .-1 } */
 
 int j = { 1 };
diff --git a/gcc/testsuite/gcc.dg/init-bad-3.c 
b/gcc/testsuite/gcc.dg/init-bad-3.c
index c5c338d..c22e8ec 100644
--- a/gcc/testsuite/gcc.dg/init-bad-3.c
+++ b/gcc/testsuite/gcc.dg/init-bad-3.c
@@ -19,7 +19,7 @@ char s[1] = "x";
 char s1[1] = { "x" };
 char t[1] = "xy"; /* { dg-error "initializer-string for array of 'char' is too 
long" } */
 char t1[1] = { "xy" }; /* { dg-error "initializer-string for array of 'char' 
is too long" } */
-char u[1] = { "x", "x" }; /* { dg-error "excess elements in 'char' array 
initializer" } */
+char u[1] = { "x", "x" }; /* { dg-error "excess elements in 'char.1.' 
initializer" } */
 /* { dg-message "near init" "near" { target *-*-* } .-1 } */
 
 int j = { 1 };
diff --git a/gcc/testsuite/gcc.dg/init-excess-3.c 
b/gcc/testsuite/gcc.dg/init-excess-3.c
index 7741261..c03a984 100644
--- a/gcc/testsuite/gcc.dg/init-excess-3.c
+++ b/gcc/testsuite/gcc.dg/init-excess-3.c
@@ -4,12 +4,12 @@
 /* { dg-options "" } */
 
 
-char s0[] = {"abc",1}; /* { dg-error "array initializer|near init" } */
-char s1[] = {"abc","a"}; /* { dg-error "array initializer|near init" } */
-char s2[] = {1,"abc"}; /* { dg-error "array initializer|near init|computable 
at load time" } */
+char s0[] = {"abc",1}; /* { dg-error "'char..' initializer|near init" } */
+char s1[] = {"abc","a"}; /* { dg-error "'char..' initializer|near init" } */
+char s2[] = {1,"abc"}; /* { dg-error "'char..' initializer|near 
init|computable at load time" } */
 /* { dg-warning "integer from pointer without a cast" "" { target *-*-* } .-1 
} */
 
-char s3[5] = {"abc",1}; /* { dg-error "array initializer|near init" } */
-char s4[5] = {"abc","a"}; /* { dg-error "array initializer|near init" } */
-char s5[5] = {1,"abc"}; /* { dg-error "array initializer|near init|computable 
at load time" } */
+char s3[5] = {"abc",1}; /* { dg-error "'char.5.' initializer|near init" } */
+char s4[5] = {"abc","a"}; /* { dg-error "'char.5.' initializer|near init" } */
+char s5[5] = {1,"a

[PATCH 1/2] Fix C/107926: Wrong error message when initializing char array

2022-11-30 Thread apinski--- via Gcc-patches

From: Andrew Pinski 

The problem here is the code which handles {"a"} is supposed
to handle the case where the is something after the string but
it only handles the case where there is another string so
we go down the other path and error out saying "excess elements
in struct initializer" even though this was a character array.
To fix this, we need to move the ckeck if the initializer is
a string after the check for array and initializer.
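
The change in control flow can be modelled with a simplified C sketch. This is not the code in c-typeck.cc: the enum and function names are hypothetical, and it assumes the remaining conditions (an undesignated array of integral type at index zero) already hold, leaving only string_flag and the presence of a replacement value as inputs:

```c
#include <stdbool.h>

enum init_action
{
  STORE_STRING,      /* Remember the string as the array's value.  */
  ERROR_CHAR_ARRAY,  /* Report excess elements in the char array.  */
  FALL_THROUGH       /* Continue to the generic element handling.  */
};

/* Model of the old logic: the whole block is guarded by STRING_FLAG,
   so a non-string second element never reaches the error.  */
static enum init_action
before_patch (bool string_flag, bool have_replacement)
{
  if (string_flag)
    {
      if (have_replacement)
        return ERROR_CHAR_ARRAY;
      return STORE_STRING;
    }
  return FALL_THROUGH;
}

/* Model of the new logic: the excess-element check runs first, and
   STRING_FLAG only decides whether the value is stored.  */
static enum init_action
after_patch (bool string_flag, bool have_replacement)
{
  if (have_replacement)
    return ERROR_CHAR_ARRAY;
  if (string_flag)
    return STORE_STRING;
  return FALL_THROUGH;
}
```

For an initializer like {"abc", 1}, the second element arrives with string_flag false and a replacement value already stored, which is the case where the two models differ.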

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

Thanks,
Andrew Pinski

gcc/c/ChangeLog:

PR c/107926
* c-typeck.cc (process_init_element):
Move the ceck for string cst until
after the error message.

gcc/testsuite/ChangeLog:

PR c/107926
* gcc.dg/init-excess-3.c: New test.
---
 gcc/c/c-typeck.cc| 15 ++-
 gcc/testsuite/gcc.dg/init-excess-3.c | 15 +++
 2 files changed, 25 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/init-excess-3.c

diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index e06f052..0fc382c 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -10623,17 +10623,22 @@ process_init_element (location_t loc, struct c_expr 
value, bool implicit,
 
   /* Handle superfluous braces around string cst as in
  char x[] = {"foo"}; */
-  if (string_flag
-  && constructor_type
+  if (constructor_type
   && !was_designated
   && TREE_CODE (constructor_type) == ARRAY_TYPE
   && INTEGRAL_TYPE_P (TREE_TYPE (constructor_type))
   && integer_zerop (constructor_unfilled_index))
 {
   if (constructor_stack->replacement_value.value)
-   error_init (loc, "excess elements in %<char%> array initializer");
-  constructor_stack->replacement_value = value;
-  return;
+   {
+ error_init (loc, "excess elements in %<char%> array initializer");
+ return;
+   }
+  else if (string_flag)
+   {
+ constructor_stack->replacement_value = value;
+ return;
+   }
 }
 
   if (constructor_stack->replacement_value.value != NULL_TREE)
diff --git a/gcc/testsuite/gcc.dg/init-excess-3.c 
b/gcc/testsuite/gcc.dg/init-excess-3.c
new file mode 100644
index 000..7741261
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/init-excess-3.c
@@ -0,0 +1,15 @@
+/* Test for various cases of excess initializers for char arrays,
+   bug 107926. */
+/* { dg-do compile } */
+/* { dg-options "" } */
+
+
+char s0[] = {"abc",1}; /* { dg-error "array initializer|near init" } */
+char s1[] = {"abc","a"}; /* { dg-error "array initializer|near init" } */
+char s2[] = {1,"abc"}; /* { dg-error "array initializer|near init|computable 
at load time" } */
+/* { dg-warning "integer from pointer without a cast" "" { target *-*-* } .-1 
} */
+
+char s3[5] = {"abc",1}; /* { dg-error "array initializer|near init" } */
+char s4[5] = {"abc","a"}; /* { dg-error "array initializer|near init" } */
+char s5[5] = {1,"abc"}; /* { dg-error "array initializer|near init|computable 
at load time" } */
+/* { dg-warning "integer from pointer without a cast" "" { target *-*-* } .-1 
} */
-- 
1.8.3.1

[committed] d: Fix ICE on named continue label in an unrolled loop [PR107592]

2022-11-30 Thread Iain Buclaw via Gcc-patches

Hi,

This patch fixes an ICE when using `continue' on a named label in an
unrolled loop statement.

Continue labels in an unrolled loop require a unique label per
iteration.  Previously this used the Statement body node for each
unrolled iteration to generate a new entry in the label hash table.
This does not work when the continue label has an identifier, as said
named label is pointing to the outer UnrolledLoopStatement node.

What would happen is that during the lowering of `continue label', an
automatic label associated with the unrolled loop would be generated,
and a jump to that label inserted, but because it was never pushed by
the visitor for the loop itself, it subsequently never gets emitted.

To fix, correctly use the UnrolledLoopStatement as the key to look up
and store the break/continue label pair, but remove the continue label
from the value entry after every loop to force a new label to be
generated by the next call to `push_continue_label'.
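
To picture why each unrolled iteration needs its own continue label, here is a hand-written C analogy (not the code the D front end emits; the function and label names are hypothetical). A loop body that skips negative values with `continue' is unrolled three times, and each copy jumps to a label placed at the end of its own iteration:

```c
/* Hypothetical hand-unrolled form of a three-element loop whose body is
     if (v < 0) continue; sum += v;
   Each iteration gets a distinct continue label.  */
static int
sum_non_negative3 (int v0, int v1, int v2)
{
  int sum = 0;

  /* Iteration 0.  */
  if (v0 < 0)
    goto continue_0;
  sum += v0;
continue_0:;

  /* Iteration 1.  */
  if (v1 < 0)
    goto continue_1;
  sum += v1;
continue_1:;

  /* Iteration 2.  */
  if (v2 < 0)
    goto continue_2;
  sum += v2;
continue_2:;

  return sum;
}
```

If all three copies shared one label, a jump from the first iteration would skip the remaining ones, which is why a fresh label has to be generated after every unrolled body.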

Bootstrapped and regression tested on x86_64-linux-gnu/-m32/-mx32, and
committed to mainline, and backported to previous release branches.

Regards
Iain.

---
PR d/107592

gcc/d/ChangeLog:

* toir.cc (IRVisitor::push_unrolled_continue_label): New method.
(IRVisitor::pop_unrolled_continue_label): New method.
(IRVisitor::visit (UnrolledLoopStatement *)): Use them instead of
push_continue_label and pop_continue_label.

gcc/testsuite/ChangeLog:

* gdc.dg/pr107592.d: New test.
---
 gcc/d/toir.cc   | 26 --
 gcc/testsuite/gdc.dg/pr107592.d | 13 +
 2 files changed, 37 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gdc.dg/pr107592.d

diff --git a/gcc/d/toir.cc b/gcc/d/toir.cc
index e5f5751f6db..e387f27760d 100644
--- a/gcc/d/toir.cc
+++ b/gcc/d/toir.cc
@@ -529,6 +529,28 @@ public:
 this->do_label (label);
   }
 
+  /* Generate and set a new continue label for the current unrolled loop.  */
+
+  void push_unrolled_continue_label (UnrolledLoopStatement *s)
+  {
+this->push_continue_label (s);
+  }
+
+  /* Finish with the continue label for the unrolled loop.  */
+
+  void pop_unrolled_continue_label (UnrolledLoopStatement *s)
+  {
+Statement *stmt = s->getRelatedLabeled ();
+d_label_entry *ent = d_function_chain->labels->get (stmt);
+gcc_assert (ent != NULL && ent->bc_label == true);
+
+this->pop_continue_label (TREE_VEC_ELT (ent->label, bc_continue));
+
+/* Remove the continue label from the label htab, as a new one must be
+   inserted at the end of every unrolled loop.  */
+ent->label = TREE_VEC_ELT (ent->label, bc_break);
+  }
+
   /* Visitor interfaces.  */
 
 
@@ -1089,9 +,9 @@ public:
 
if (statement != NULL)
  {
-   tree lcontinue = this->push_continue_label (statement);
+   this->push_unrolled_continue_label (s);
this->build_stmt (statement);
-   this->pop_continue_label (lcontinue);
+   this->pop_unrolled_continue_label (s);
  }
   }
 
diff --git a/gcc/testsuite/gdc.dg/pr107592.d b/gcc/testsuite/gdc.dg/pr107592.d
new file mode 100644
index 000..59f34477356
--- /dev/null
+++ b/gcc/testsuite/gdc.dg/pr107592.d
@@ -0,0 +1,13 @@
+// https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107592
+// { dg-do compile }
+
+void test107592(Things...)(Things things)
+{
+label:
+foreach (thing; things)
+{
+continue label;
+}
+}
+
+alias a107592 = test107592!(string);
-- 
2.37.2

Re: [PATCH] c++: Incremental fix for g++.dg/gomp/for-21.C [PR84469]

2022-11-30 Thread Jakub Jelinek via Gcc-patches

On Tue, Nov 29, 2022 at 11:05:33PM +0100, Jakub Jelinek wrote:
> On Tue, Nov 29, 2022 at 04:38:50PM -0500, Jason Merrill wrote:
> > > --- gcc/testsuite/g++.dg/gomp/for-21.C.jj 2020-01-12 11:54:37.178401867 
> > > +0100
> > > +++ gcc/testsuite/g++.dg/gomp/for-21.C2022-11-29 13:06:59.038410557 
> > > +0100
> > > @@ -54,9 +54,9 @@ void
> > >   f6 (S (&a)[10])
> > >   {
> > > #pragma omp for collapse (2)
> > > -  for (auto [i, j, k] : a)   // { dg-error "use of 
> > > 'i' before deduction of 'auto'" "" { target *-*-* } .-1 }
> > > +  for (auto [i, j, k] : a)   // { dg-error "use of 
> > > 'i' before deduction of 'auto'" }
> > >   for (int l = i; l < j; l += k)  // { dg-error "use of 
> > > 'j' before deduction of 'auto'" }
> > > -  ;  // { dg-error "use of 
> > > 'k' before deduction of 'auto'" "" { target *-*-* } .-3 }
> > > +  ;  // { dg-error "use of 
> > > 'k' before deduction of 'auto'" "" { target *-*-* } .-1 }
> > 
> > Hmm, this error is surprising: since the initializer is non-dependent, we
> > should have deduced immediately.  I'd expect the same error as in the
> > non-structured-binding cases, "* expression refers to iteration variable".
> 
> The reason was just to be consistent with what is (unfortunately) emitted
> in the other cases (!processing_template_decl or type dependent).
> I guess I could try to see how much work it would be to deduce it sooner, but
> generally it is a pretty rare corner case; people rarely do this in OpenMP code.

I had a look at that today, but it would be pretty hard.  The thing is
we must emit all the associated code for all the range for loops in
OpenMP loops at a different spot.  So, the only possibility I see would
be if we during parsing of a range for loop inside of the OpenMP loop nest
we don't do the cp_finish_omp_range_for stuff to avoid e.g. cp_finish_decl,
but instead
  build_x_indirect_ref (input_location, begin, RO_UNARY_STAR,
NULL_TREE, tf_none)
and if that gives a non-dependent type, temporarily overwrite TREE_TYPE
of the decl and if it is structured binding, temporarily
++processing_template_decl and cp_finish_decomp, then after parsing all the
associated loop headers we revert that (and ditto for instantiation of
OpenMP loops).

Jakub

Re: [PATCH 3/3] vect: inbranch SIMD clones

2022-11-30 Thread Jakub Jelinek via Gcc-patches

On Wed, Nov 30, 2022 at 03:17:30PM +, Andrew Stubbs wrote:
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/vect-simd-clone-16.c
> @@ -0,0 +1,89 @@
> +/* { dg-require-effective-target vect_simd_clones } */
> +/* { dg-additional-options "-fopenmp-simd -fdump-tree-optimized" } */
> +/* { dg-additional-options "-mavx" { target avx_runtime } } */
...
> +/* Ensure the in-branch simd clones are used on targets that support
> +   them.  These counts include all calls and definitions.  */
> +
> +/* { dg-skip-if "" { x86_64-*-* } { "-flto" } { "" } } */

Maybe better add -ffat-lto-objects to dg-additional-options and drop
the dg-skip-if (if it works with that, for all similar tests)?

> @@ -1063,7 +1064,8 @@ if_convertible_gimple_assign_stmt_p (gimple *stmt,
> A statement is if-convertible if:
> - it is an if-convertible GIMPLE_ASSIGN,
> - it is a GIMPLE_LABEL or a GIMPLE_COND,
> -   - it is builtins call.  */
> +   - it is builtins call.

s/call\./call,/ above

> +   - it is a call to a function with a SIMD clone.  */
>  
>  static bool
>  if_convertible_stmt_p (gimple *stmt, vec<data_reference_p> refs)
> @@ -1083,13 +1085,23 @@ if_convertible_stmt_p (gimple *stmt, 
> vec<data_reference_p> refs)
>   tree fndecl = gimple_call_fndecl (stmt);
>   if (fndecl)
> {
> + /* We can vectorize some builtins and functions with SIMD
> +"inbranch" clones.  */
>   int flags = gimple_call_flags (stmt);
> + struct cgraph_node *node = cgraph_node::get (fndecl);
>   if ((flags & ECF_CONST)
>   && !(flags & ECF_LOOPING_CONST_OR_PURE)
> - /* We can only vectorize some builtins at the moment,
> -so restrict if-conversion to those.  */
>   && fndecl_built_in_p (fndecl))
> return true;
> + else if (node && node->simd_clones != NULL)

I don't see much value in the "else " above, the if branch returns
if condition is true, so just
if (node && node->simd_clones != NULL)
would do it.

> +   /* Ensure that at least one clone can be "inbranch".  */
> +   for (struct cgraph_node *n = node->simd_clones; n != NULL;
> +n = n->simdclone->next_clone)
> + if (n->simdclone->inbranch)
> +   {
> + need_to_predicate = true;
> + return true;
> +   }
> }
>   return false;
>}
> @@ -2603,6 +2615,29 @@ predicate_statements (loop_p loop)
> gimple_assign_set_rhs1 (stmt, ifc_temp_var (type, rhs, &gsi));
> update_stmt (stmt);
>   }
> +
> +   /* Convert functions that have a SIMD clone to IFN_MASK_CALL.  This
> +  will cause the vectorizer to match the "in branch" clone variants,
> +  and serves to build the mask vector in a natural way.  */
> +   gcall *call = dyn_cast <gcall *> (gsi_stmt (gsi));
> +   if (call && !gimple_call_internal_p (call))
> + {
> +   tree orig_fn = gimple_call_fn (call);
> +   int orig_nargs = gimple_call_num_args (call);
> +   auto_vec<tree> args;
> +   args.safe_push (orig_fn);
> +   for (int i=0; i < orig_nargs; i++)

Formatting - int i = 0;

> + args.safe_push (gimple_call_arg (call, i));
> +   args.safe_push (cond);
> +
> +   /* Replace the call with a IFN_MASK_CALL that has the extra
> +  condition parameter. */
> +   gcall *new_call = gimple_build_call_internal_vec (IFN_MASK_CALL,
> + args);
> +   gimple_call_set_lhs (new_call, gimple_call_lhs (call));
> +   gsi_replace (&gsi, new_call, true);
> + }
> +
> lhs = gimple_get_lhs (gsi_stmt (gsi));
> if (lhs && TREE_CODE (lhs) == SSA_NAME)
>   ssa_names.add (lhs);
> --- a/gcc/tree-vect-stmts.cc
> +++ b/gcc/tree-vect-stmts.cc
> @@ -3987,6 +3987,7 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info,
>    size_t i, nargs;
>    tree lhs, rtype, ratype;
>    vec<constructor_elt, va_gc> *ret_ctor_elts = NULL;
> +  int arg_offset = 0;
>  
>    /* Is STMT a vectorizable call?   */
>    gcall *stmt = dyn_cast <gcall *> (stmt_info->stmt);
> @@ -3994,6 +3995,16 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info,
>  return false;
>  
>fndecl = gimple_call_fndecl (stmt);
> +  if (fndecl == NULL_TREE
> +  && gimple_call_internal_p (stmt)
> +  && gimple_call_internal_fn (stmt) == IFN_MASK_CALL)

Replace the above 2 lines with
  && gimple_call_internal_p (stmt, IFN_MASK_CALL))
?

> --- a/gcc/tree-vect-loop.cc
> +++ b/gcc/tree-vect-loop.cc
> @@ -2121,6 +2121,15 @@

[PATCH][OG12] amdgcn: Support AMD-specific 'isa' and 'arch' traits in OpenMP context selectors

2022-11-30 Thread Paul-Antoine Arras


Hi all,

This patch adds or fixes support for various AMD 'isa' and 'arch' trait 
selectors, so as to be consistent with LLVM. It also adds test cases 
checking all supported AMD ISAs are properly recognised when used in a 
'metadirective' construct.


This patch is closely related to 
https://gcc.gnu.org/r13-4403-g1fd508744eccda but cannot be committed to 
mainline because metadirectives and dynamic context selectors have not 
landed there yet.


Can this be committed to OG12?

Thanks,

From 88522107dd39ba3ff8465cf688fe4438fa3b77b4 Mon Sep 17 00:00:00 2001
From: Paul-Antoine Arras 
Date: Wed, 30 Nov 2022 14:52:55 +0100
Subject: [PATCH] amdgcn: Support AMD-specific 'isa' and 'arch' traits in
 OpenMP context selectors

Add or fix libgomp support for 'amdgcn' as arch, and 'gfx908' and 'gfx90a' as
isa traits.
Add test case for all supported 'isa' values used as context selectors in a
metadirective construct.

libgomp/ChangeLog:

* config/gcn/selector.c (GOMP_evaluate_current_device): Recognise 
'amdgcn' as arch, and 'gfx908' and
'gfx90a' as isa traits.
* testsuite/libgomp.c-c++-common/metadirective-6.c: New test.
---
 libgomp/config/gcn/selector.c | 15 --
 .../libgomp.c-c++-common/metadirective-6.c| 48 +++
 2 files changed, 60 insertions(+), 3 deletions(-)
 create mode 100644 libgomp/testsuite/libgomp.c-c++-common/metadirective-6.c

diff --git libgomp/config/gcn/selector.c libgomp/config/gcn/selector.c
index 60793fc05d3..c948497c538 100644
--- libgomp/config/gcn/selector.c
+++ libgomp/config/gcn/selector.c
@@ -36,7 +36,7 @@ GOMP_evaluate_current_device (const char *kind, const char *arch,
   if (kind && strcmp (kind, "gpu") != 0)
 return false;
 
-  if (arch && strcmp (arch, "gcn") != 0)
+  if (arch && (strcmp (arch, "gcn") != 0 || strcmp (arch, "amdgcn") != 0))
 return false;
 
   if (!isa)
@@ -48,8 +48,17 @@ GOMP_evaluate_current_device (const char *kind, const char *arch,
 #endif
 
 #ifdef __GCN5__
-  if (strcmp (isa, "gfx900") == 0 || strcmp (isa, "gfx906") != 0
-  || strcmp (isa, "gfx908") == 0)
+  if (strcmp (isa, "gfx900") == 0 || strcmp (isa, "gfx906") != 0)
+return true;
+#endif
+
+#ifdef __CDNA1__
+  if (strcmp (isa, "gfx908") == 0)
+return true;
+#endif
+
+#ifdef __CDNA2__
+  if (strcmp (isa, "gfx90a") == 0)
 return true;
 #endif
 
diff --git libgomp/testsuite/libgomp.c-c++-common/metadirective-6.c libgomp/testsuite/libgomp.c-c++-common/metadirective-6.c
new file mode 100644
index 000..6d169001db1
--- /dev/null
+++ libgomp/testsuite/libgomp.c-c++-common/metadirective-6.c
@@ -0,0 +1,48 @@
+/* { dg-do link { target { offload_target_amdgcn } } } */
+/* { dg-additional-options "-foffload=-fdump-tree-omp_expand_metadirective" } */
+
+#define N 100
+
+void f (int x[], int y[], int z[])
+{
+  int i;
+
+  #pragma omp target map(to: x, y) map(from: z)
+#pragma omp metadirective \
+  when (device={isa("gfx803")}: teams num_teams(512)) \
+  when (device={isa("gfx900")}: teams num_teams(256)) \
+  when (device={isa("gfx906")}: teams num_teams(128)) \
+  when (device={isa("gfx908")}: teams num_teams(64)) \
+  when (device={isa("gfx90a")}: teams num_teams(32)) \
+  default (teams num_teams(4))
+   for (i = 0; i < N; i++)
+ z[i] = x[i] * y[i];
+}
+
+int main (void)
+{
+  int x[N], y[N], z[N];
+  int i;
+
+  for (i = 0; i < N; i++)
+{
+  x[i] = i;
+  y[i] = -i;
+}
+
+  f (x, y, z);
+
+  for (i = 0; i < N; i++)
+if (z[i] != x[i] * y[i])
+  return 1;
+
+  return 0;
+}
+
+/* The metadirective should be resolved after Gimplification.  */
+
+/* { dg-final { scan-offload-tree-dump "__builtin_GOMP_teams4 \\(512, 512" "omp_expand_metadirective" { target { any-opts "-foffload=-march=fiji" } } } } */
+/* { dg-final { scan-offload-tree-dump "__builtin_GOMP_teams4 \\(256, 256" "omp_expand_metadirective" { target { any-opts "-foffload=-march=gfx900" } } } } */
+/* { dg-final { scan-offload-tree-dump "__builtin_GOMP_teams4 \\(128, 128" "omp_expand_metadirective" { target { any-opts "-foffload=-march=gfx906" } } } } */
+/* { dg-final { scan-offload-tree-dump "__builtin_GOMP_teams4 \\(64, 64" "omp_expand_metadirective" { target { any-opts "-foffload=-march=gfx908" } } } } */
+/* { dg-final { scan-offload-tree-dump "__builtin_GOMP_teams4 \\(32, 32" "omp_expand_metadirective" { target { any-opts "-foffload=-march=gfx90a" } } } } */
-- 
2.31.1
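
A note on two string tests in the selector.c hunk above, offered as a reading of the diff as posted rather than of whatever was eventually committed. For a non-null arch, `strcmp (arch, "gcn") != 0 || strcmp (arch, "amdgcn") != 0` holds for every string, since no string equals both names, so the function would return false whenever an arch is given. Likewise, `strcmp (isa, "gfx906") != 0` accepts every isa name except gfx906. The sketch below contrasts the posted tests with what the commit message appears to intend; all four helper names are hypothetical and none of them is libgomp API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: the arch test exactly as written in the hunk.
   Returns true when the selector would be rejected.  */
bool
arch_rejected_as_posted (const char *arch)
{
  return arch && (strcmp (arch, "gcn") != 0 || strcmp (arch, "amdgcn") != 0);
}

/* Hypothetical helper: the assumed intent, i.e. reject only names that
   are neither "gcn" nor "amdgcn".  */
bool
arch_rejected_assumed_intent (const char *arch)
{
  return arch && strcmp (arch, "gcn") != 0 && strcmp (arch, "amdgcn") != 0;
}

/* Hypothetical helper: the __GCN5__ isa test as written in the hunk.  */
bool
gcn5_isa_as_posted (const char *isa)
{
  return strcmp (isa, "gfx900") == 0 || strcmp (isa, "gfx906") != 0;
}

/* Hypothetical helper: the assumed intent, i.e. accept gfx900 and gfx906.  */
bool
gcn5_isa_assumed_intent (const char *isa)
{
  return strcmp (isa, "gfx900") == 0 || strcmp (isa, "gfx906") == 0;
}

/* Small self-check showing where the two readings differ.  */
void
selector_logic_self_check (void)
{
  assert (arch_rejected_as_posted ("amdgcn"));        /* rejected as posted */
  assert (!arch_rejected_assumed_intent ("amdgcn"));  /* accepted as intended */
  assert (gcn5_isa_as_posted ("gfx90a"));             /* accepted as posted */
  assert (!gcn5_isa_assumed_intent ("gfx90a"));       /* rejected as intended */
}
```

If this reading is right, joining the two arch comparisons with `&&` and comparing gfx906 with `== 0` would match the description in the commit message.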

Re: [PATCH 3/3] vect: inbranch SIMD clones

2022-11-30 Thread Andrew Stubbs


On 09/09/2022 15:31, Jakub Jelinek wrote:

--- a/gcc/tree-if-conv.cc
+++ b/gcc/tree-if-conv.cc
@@ -1074,13 +1076,19 @@ if_convertible_stmt_p (gimple *stmt, vec<data_reference_p> refs)
tree fndecl = gimple_call_fndecl (stmt);
if (fndecl)
  {
+   /* We can vectorize some builtins and functions with SIMD
+  clones.  */
int flags = gimple_call_flags (stmt);
+   struct cgraph_node *node = cgraph_node::get (fndecl);
if ((flags & ECF_CONST)
&& !(flags & ECF_LOOPING_CONST_OR_PURE)
-   /* We can only vectorize some builtins at the moment,
-  so restrict if-conversion to those.  */
&& fndecl_built_in_p (fndecl))
  return true;
+   else if (node && node->simd_clones != NULL)
+ {
+   need_to_predicate = true;


I think it would be worth it to check that at least one of the
node->simd_clones clones has ->inbranch set, because if all calls
are declare simd notinbranch, then predicating the loop will be just a
wasted effort.


+   return true;
+ }
  }
return false;
}
@@ -2614,6 +2622,31 @@ predicate_statements (loop_p loop)
  gimple_assign_set_rhs1 (stmt, ifc_temp_var (type, rhs, &gsi));
  update_stmt (stmt);
}
+
+ /* Add a predicate parameter to functions that have a SIMD clone.
+This will cause the vectorizer to match the "in branch" clone
+variants because they also have the extra parameter, and serves
+to build the mask vector in a natural way.  */
+ gcall *call = dyn_cast <gcall *> (gsi_stmt (gsi));
+ if (call && !gimple_call_internal_p (call))
+   {
+ tree orig_fndecl = gimple_call_fndecl (call);
+ int orig_nargs = gimple_call_num_args (call);
+ auto_vec<tree> args;
+ for (int i=0; i < orig_nargs; i++)
+   args.safe_push (gimple_call_arg (call, i));
+ args.safe_push (cond);
+
+ /* Replace the call with a new one that has the extra
+parameter.  The FUNCTION_DECL remains unchanged so that
+the vectorizer can find the SIMD clones.  This call will
+either be deleted or replaced at that time, so the
+mismatch is short-lived and we can live with it.  */
+ gcall *new_call = gimple_build_call_vec (orig_fndecl, args);
+ gimple_call_set_lhs (new_call, gimple_call_lhs (call));
+ gsi_replace (&gsi, new_call, true);


I think this is way too dangerous to represent conditional calls that way,
there is nothing to distinguish those from non-conditional calls.
I think I'd prefer (but please see what Richi thinks too) to represent
the conditional calls as a call to a new internal function, say
IFN_COND_CALL or IFN_MASK_CALL, which would have the arguments the original
call had, plus 2 extra ones first (or 3?), one that would be saved copy of
original gimple_call_fn (i.e. usually &fndecl), another one that would be the
condition (and dunno about whether we need also something to represent
gimple_call_fntype, or whether we simply should punt during ifcvt
on conditional calls where gimple_call_fntype is incompatible with
the function type of fndecl.  Another question is about
gimple_call_chain.  Punt or copy it over to the ifn and back.


The attached should resolve these issues.

OK for mainline?

Andrew

vect: inbranch SIMD clones

There has been support for generating "inbranch" SIMD clones for a long time,
but nothing actually uses them (as far as I can see).

This patch adds support for a subset of possible cases (those using
mask_mode == VOIDmode).  The other cases fail to vectorize, just as before,
so there should be no regressions.

The subset of support should cover all cases needed by amdgcn, at present.

gcc/ChangeLog:

* internal-fn.cc (expand_MASK_CALL): New.
* internal-fn.def (MASK_CALL): New.
* internal-fn.h (expand_MASK_CALL): New prototype.
* omp-simd-clone.cc (simd_clone_adjust_argument_types): Set vector_type
for mask arguments also.
* tree-if-conv.cc: Include cgraph.h.
(if_convertible_stmt_p): Do if conversions for calls to SIMD calls.
(predicate_statements): Convert functions to IFN_MASK_CALL.
* tree-vect-loop.cc (vect_get_datarefs_in_loop): Recognise
IFN_MASK_CALL as a SIMD function call.
* tree-vect-stmts.cc (vectorizable_simd_clone_call): Handle
IFN_MASK_CALL as an inbranch SIMD function call.
Generate the mask vector arguments.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/vect-simd-clone-16.c: New test.
* gcc.dg/vect/vect-simd-clone-16b.c: New test.
* gcc.dg/vect/vect-simd-clone-16c.c: New test.
* gcc.dg/vect/vect-simd-clone-16d.c: New test.
* gcc.dg/vect/vect-simd-clone-16e.c: New test.
*

[V2][PATCH 1/1] Add a new warning option -Wstrict-flex-arrays.

2022-11-30 Thread Qing Zhao via Gcc-patches

'-Wstrict-flex-arrays'
 Warn about improper uses of flexible array members according to
 the LEVEL of the 'strict_flex_array (LEVEL)' attribute attached to
 the trailing array field of a structure if it's available,
 otherwise according to the LEVEL of the option
 '-fstrict-flex-arrays=LEVEL'.

 This option is effective only when LEVEL is bigger than 0.
 Otherwise, it will be ignored with a warning.

 When LEVEL=1, warnings will be issued for a trailing array
 reference of a structure that has 2 or more elements if the
 trailing array is referenced as a flexible array member.

 When LEVEL=2, in addition to LEVEL=1, additional warnings will be
 issued for a trailing one-element array reference of a structure if
 the array is referenced as a flexible array member.

 When LEVEL=3, in addition to LEVEL=2, additional warnings will be
 issued for a trailing zero-length array reference of a structure if
 the array is referenced as a flexible array member.
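
To make the levels concrete, here is a small sketch of the four trailing-array shapes the description distinguishes. The struct and function names are made up for illustration, and the level annotations only restate the text above; which diagnostics actually fire depends on the compiler version and the options in use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical example types, one per trailing-array shape.  */
struct with_fam  { int n; int a[];  };  /* C99 flexible array member.  */
struct with_zero { int n; int a[0]; };  /* GNU zero-length array; flagged at LEVEL=3 per the text.  */
struct with_one  { int n; int a[1]; };  /* One-element array; flagged at LEVEL>=2 per the text.  */
struct with_four { int n; int a[4]; };  /* 2 or more elements; flagged at LEVEL>=1 per the text.  */

/* Proper flexible array member: allocate COUNT trailing ints and sum them.
   COUNT is assumed to be at least 1.  */
int
sum_fam (int count)
{
  struct with_fam *p = malloc (sizeof *p + (size_t) count * sizeof (int));
  if (!p)
    return -1;
  p->n = count;
  int sum = 0;
  for (int i = 0; i < count; i++)
    {
      p->a[i] = i;
      sum += p->a[i];
    }
  free (p);
  return sum;
}

/* Legacy idiom: over-allocate a struct ending in 'int a[1]' and index past
   the declared bound.  This is the kind of reference the option is meant
   to flag; ISO C does not bless it.  COUNT is assumed to be at least 1.  */
int
sum_legacy_one_element (int count)
{
  struct with_one *p = malloc (sizeof *p + (size_t) (count - 1) * sizeof (int));
  if (!p)
    return -1;
  p->n = count;
  int sum = 0;
  for (int i = 0; i < count; i++)
    {
      p->a[i] = i;   /* i >= 1 treats a[1] as a flexible array member.  */
      sum += p->a[i];
    }
  free (p);
  return sum;
}

/* Both layouts give the same sum when the trailing array is treated as flexible.  */
void
flex_array_self_check (void)
{
  assert (sum_fam (4) == 6);
  assert (sum_legacy_one_element (4) == 6);
}
```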

At the same time, -Warray-bounds is updated:

 A. add the following to clarify the relationship with the LEVEL of
-fstrict-flex-arrays:

 By default, the trailing array of a structure will be treated as a
 flexible array member by '-Warray-bounds' or '-Warray-bounds=N' if
 it is declared as either a flexible array member per C99 standard
 onwards ('[]'), a GCC zero-length array extension ('[0]'), or a
 one-element array ('[1]').  As a result, out of bounds subscripts
 or offsets into zero-length arrays or one-element arrays are not
 warned about by default.

 You can add the option '-fstrict-flex-arrays' or
 '-fstrict-flex-arrays=LEVEL' to control how this option treats the
 trailing array of a structure as a flexible array member.

 When LEVEL<=1, no change to the default behavior.

 When LEVEL=2, additional warnings will be issued for out of bounds
 subscripts or offsets into one-element arrays;

 When LEVEL=3, in addition to LEVEL=2, additional warnings will be
 issued for out of bounds subscripts or offsets into zero-length
 arrays.

 B. change the -Warray-bounds=2 to exclude the control on how to treat
trailing arrays as flexible array members:

 '-Warray-bounds=2'
  This warning level also warns about the intermediate results
  of pointer arithmetic that may yield out of bounds values.
  This warning level may give a larger number of false positives
  and is deactivated by default.

gcc/ChangeLog:

* attribs.cc (strict_flex_array_level_of): New function.
* attribs.h (strict_flex_array_level_of): Prototype for new function.
* doc/invoke.texi: Document -Wstrict-flex-arrays option. Update
-Warray-bounds by specifying the impact from -fstrict-flex-arrays.
Also update -Warray-bounds=2 by eliminating its impact on treating
trailing arrays as flexible array members.
* gimple-array-bounds.cc (array_bounds_checker::check_array_ref):
Issue warnings for -Wstrict-flex-arrays.
(get_up_bounds_for_array_ref): New function.
(check_out_of_bounds_and_warn): New function.
* opts.cc (finish_options): Issue warning for unsupported combination
of -Wstrict-flex-arrays and -fstrict-flex-arrays.
* tree-vrp.cc (execute_ranger_vrp): Enable the pass when
warn_strict_flex_array is true.
* tree.cc (array_ref_flexible_size_p): Add one new argument.
(component_ref_sam_type): New function.
(component_ref_size): Control with level of strict-flex-array.
* tree.h (array_ref_flexible_size_p): Update prototype.
(enum struct special_array_member): Add two new enum values.
(component_ref_sam_type): New prototype.

gcc/c-family/ChangeLog:

* c.opt (Wstrict-flex-arrays): New option.

gcc/c/ChangeLog:

* c-decl.cc (is_flexible_array_member_p): Call new function
strict_flex_array_level_of.

gcc/testsuite/ChangeLog:

* c-c++-common/Wstrict-flex-arrays.c: New test.
* c-c++-common/Wstrict-flex-arrays_2.c: New test.
* gcc.dg/Warray-bounds-11.c: Update warnings for -Warray-bounds=2.
* gcc.dg/Wstrict-flex-arrays-2.c: New test.
* gcc.dg/Wstrict-flex-arrays-3.c: New test.
* gcc.dg/Wstrict-flex-arrays-4.c: New test.
* gcc.dg/Wstrict-flex-arrays-5.c: New test.
* gcc.dg/Wstrict-flex-arrays-6.c: New test.
* gcc.dg/Wstrict-flex-arrays-7.c: New test.
* gcc.dg/Wstrict-flex-arrays-8.c: New test.
* gcc.dg/Wstrict-flex-arrays-9.c: New test.
* gcc.dg/Wstrict-flex-arrays.c: New test.
---
 gcc/attribs.cc|  30 +++
 gcc/attribs.h |   2 +
 gcc/c-family/c.opt|   5 +
 gcc/c/c-decl.cc   |  22 +-
 gcc/doc/invoke.texi   |  54 +++-
 gcc

[V2][PATCH 0/1]Add a new warning option -Wstrict-flex-arrays

2022-11-30 Thread Qing Zhao via Gcc-patches

Hi, this is the 2nd version of this patch.

Per our discussion, I made the following changes compared to the first
version:

1. The level of -Warray-bounds will NOT control how a trailing array
   is considered as a flex array member anymore. Only the level of
   -fstrict-flex-arrays will control this;
2. Updated the documentation for -Warray-bounds to clarify this
   change.
3. Updated the test cases for this change.

I have bootstrapped and regression tested on both X86 and aarch64
without any issue.

Okay for committing?

thanks.

Qing

Re: [PATCH 0/2] Support HWASAN with Intel LAM

2022-11-30 Thread Martin Liška

On 11/29/22 03:37, Hongtao Liu wrote:
> On Mon, Nov 28, 2022 at 10:40 PM Martin Liška  wrote:
>>
>> On 11/11/22 02:26, liuhongt via Gcc-patches wrote:
>>>    2 years ago, ARM folks added support for HWASAN[1] in GCC[2], and introduced several
>>> target hooks (many thanks for their work) so other backends can do similar
>>> things if they have a similar feature.
>>>    Intel LAM (Linear Address Masking)[3, Chapter 14] supports a similar feature:
>>> the upper bits of pointers can be used as metadata. LAM supports two modes:
>>>    LAM_U48: bits 48-62 can be used as metadata
>>>    LAM_U57: bits 57-62 can be used as metadata.
>>>
>>> These 2 patches mainly support those target hooks, but HWASAN is not really
>>> enabled until the final decision on the LAM kernel interface, which may take
>>> quite a long time. We have verified our patches with a "fake" interface locally[4], and
>>> decided to push the backend patches to GCC 13 to make other HWASAN developers' work
>>> easier.
>>
>> Hello.
>>
>> A few random comments I noticed:
>>
>> 1) please document the new target -mlam in extend.texi
> I will.

Thanks.

>> 2) the description speaks about bits [48-62] or [57-62]; can you explain why the
>> patch contains:
>>
> Kernel will use bit 63 for special purposes, and here we want to
> extract the tag by shifting right the pointer 57 bits, and need to
> manually mask off bit63.
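
As a rough illustration of the tag extraction described in the reply above (a sketch only, with hypothetical helper names; this is not the code from the patch): with LAM_U57 the tag occupies bits 57-62, so after shifting the pointer right by 57 bits, bit 63 lands in bit 6 of the result and has to be masked away.

```c
#include <assert.h>
#include <stdint.h>

#define LAM_U57_SHIFT    57      /* Lowest tag bit under LAM_U57.  */
#define LAM_U57_TAG_MASK 0x3fu   /* Six tag bits: bits 57-62.  */

/* Hypothetical helper: pull the 6-bit tag out of a tagged address.
   The shift leaves bit 63 in bit 6; the mask drops it.  */
unsigned
lam_u57_extract_tag (uint64_t addr)
{
  return (unsigned) ((addr >> LAM_U57_SHIFT) & LAM_U57_TAG_MASK);
}

/* Hypothetical helper: clear the tag bits and keep everything else,
   including bit 63.  */
uint64_t
lam_u57_strip_tag (uint64_t addr)
{
  return addr & ~((uint64_t) LAM_U57_TAG_MASK << LAM_U57_SHIFT);
}

/* Self-check with a made-up address that has bit 63 set.  */
void
lam_u57_self_check (void)
{
  uint64_t base = (UINT64_C (1) << 63) | UINT64_C (0x1234);
  uint64_t tagged = base | ((uint64_t) 0x2a << LAM_U57_SHIFT);
  assert (lam_u57_extract_tag (tagged) == 0x2a);
  assert (lam_u57_strip_tag (tagged) == base);
}
```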

And thanks for the explanation.

Martin

>> +  /* Mask off bit63 when LAM_U57.  */
>> +  if (ix86_lam_type == lam_u57)
>> ?
>>
>> 3) Shouldn't the -mlam option emit a GNU_PROPERTY_X86_FEATURE_1_LAM_U57 or
>> GNU_PROPERTY_X86_FEATURE_1_LAM_U48 .gnu.property note?
>>
>> 4) Can you please explain Florian's comment here:
>> https://gitlab.com/x86-psABIs/x86-64-ABI/-/merge_requests/13#note_1181396487
>>
>> Thanks,
>> Martin
>>
>>>
>>> [1] https://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html
>>> [2] https://gcc.gnu.org/pipermail/gcc-patches/2020-November/557857.html
>>> [3] https://www.intel.com/content/dam/develop/external/us/en/documents/architecture-instruction-set-extensions-programming-reference.pdf
>>> [4] https://gitlab.com/x86-gcc/gcc/-/tree/users/intel/lam/master
>>>
>>>
>>> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
>>> Ok for trunk?
>>>
>>> liuhongt (2):
>>>Implement hwasan target_hook.
>>>Enable hwasan for x86-64.
>>>
>>>   gcc/config/i386/i386-expand.cc  |  12 
>>>   gcc/config/i386/i386-options.cc |   3 +
>>>   gcc/config/i386/i386-opts.h |   6 ++
>>>   gcc/config/i386/i386-protos.h   |   2 +
>>>   gcc/config/i386/i386.cc | 123 
>>>   gcc/config/i386/i386.opt|  16 +
>>>   libsanitizer/configure.tgt  |   1 +
>>>   7 files changed, 163 insertions(+)
>>>
>>
> 
>

[PATCH (pushed)] switch conversion: remove dead variable

2022-11-30 Thread Martin Liška

gcc/ChangeLog:

* tree-switch-conversion.cc (bit_test_cluster::emit): Remove
dead variable bt_range.
---
 gcc/tree-switch-conversion.cc | 2 --
 1 file changed, 2 deletions(-)

diff --git a/gcc/tree-switch-conversion.cc b/gcc/tree-switch-conversion.cc
index 83ba1c1ca03..1d75d7c7fc7 100644
--- a/gcc/tree-switch-conversion.cc
+++ b/gcc/tree-switch-conversion.cc
@@ -1518,7 +1518,6 @@ bit_test_cluster::emit (tree index_expr, tree index_type,
 
   tree minval = get_low ();
   tree maxval = get_high ();
-  unsigned HOST_WIDE_INT bt_range = get_range (minval, maxval);
 
   /* Go through all case labels, and collect the case labels, profile
  counts, and other information we need to build the branch tests.  */
@@ -1676,7 +1675,6 @@ bit_test_cluster::emit (tree index_expr, tree index_type,
 {
   profile_probability prob = test[k].prob / (subtree_prob + default_prob);
   subtree_prob -= test[k].prob;
-  bt_range -= test[k].bits;
   tmp = wide_int_to_tree (word_type_node, test[k].mask);
   tmp = fold_build2_loc (loc, BIT_AND_EXPR, word_type_node, csui, tmp);
   tmp = fold_build2_loc (loc, NE_EXPR, boolean_type_node,
-- 
2.38.1

RE: [PATCH v2] Add pattern to convert vector shift + bitwise and + multiply to vector compare in some cases.

2022-11-30 Thread Tamar Christina via Gcc-patches

> -Original Message-
> From: Gcc-patches  bounces+tamar.christina=arm@gcc.gnu.org> On Behalf Of Tamar
> Christina via Gcc-patches
> Sent: Wednesday, November 30, 2022 1:24 PM
> To: Richard Biener ; Manolis Tsamis
> 
> Cc: gcc-patches@gcc.gnu.org; Philipp Tomsich ;
> jiangning@amperecomputing.com; Christoph Muellner
> 
> Subject: RE: [PATCH v2] Add pattern to convert vector shift + bitwise and +
> multiply to vector compare in some cases.
> 
> > -Original Message-
> > From: Richard Biener 
> > Sent: Wednesday, November 30, 2022 1:19 PM
> > To: Manolis Tsamis 
> > Cc: gcc-patches@gcc.gnu.org; Philipp Tomsich
> > ; Tamar Christina ;
> > jiangning@amperecomputing.com; Christoph Muellner
> > 
> > Subject: Re: [PATCH v2] Add pattern to convert vector shift + bitwise
> > and + multiply to vector compare in some cases.
> >
> > On Wed, Nov 30, 2022 at 9:59 AM Manolis Tsamis
> > 
> > wrote:
> > >
> > > On Wed, Nov 30, 2022 at 9:44 AM Richard Biener
> > >  wrote:
> > > >
> > > > On Tue, Nov 29, 2022 at 11:05 AM Manolis Tsamis
> >  wrote:
> > > > >
> > > > > > When using SWAR (SIMD in a register) techniques a comparison
> > > > > > operation within such a register can be made by using a
> > > > > > combination of shifts, bitwise AND and multiplication. If code
> > > > > > using this scheme is vectorized then there is potential to
> > > > > > replace all these operations with a single vector comparison, by
> > > > > > reinterpreting the vector types to match the width of the SWAR
> > > > > > register.
> > > > > >
> > > > > > For example, for the test function packed_cmp_16_32, the
> > > > > > original generated code is:
> > > > >
> > > > > ldr q0, [x0]
> > > > > add w1, w1, 1
> > > > > > ushr v0.4s, v0.4s, 15
> > > > > and v0.16b, v0.16b, v2.16b
> > > > > shl v1.4s, v0.4s, 16
> > > > > sub v0.4s, v1.4s, v0.4s
> > > > > str q0, [x0], 16
> > > > > cmp w2, w1
> > > > > bhi .L20
> > > > >
> > > > > with this pattern the above can be optimized to:
> > > > >
> > > > > ldr q0, [x0]
> > > > > add w1, w1, 1
> > > > > > cmlt v0.8h, v0.8h, #0
> > > > > str q0, [x0], 16
> > > > > cmp w2, w1
> > > > > bhi .L20
> > > > >
> > > > > The effect is similar for x86-64.
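
For readers unfamiliar with the idiom, the scalar shape being matched can be sketched as follows. This is an illustration only: the function names are made up, and the mask 0x00010001 and multiplier 0xffff are assumptions inferred from the shift by 15 and the shl-by-16/sub pair in the assembly quoted above, not values taken from the patch's test file.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical SWAR helper: treat X as two packed 16-bit lanes and return
   0xffff in every lane whose sign bit is set, 0 otherwise.
   Shift moves each lane's sign bit to the lane's bit 0, the AND isolates
   it, and the multiply by 0xffff smears it across the lane.  */
uint32_t
swar_lt0_16x2 (uint32_t x)
{
  return ((x >> 15) & UINT32_C (0x00010001)) * UINT32_C (0xffff);
}

/* Lane-by-lane reference: what a vector "compare less than zero" on
   16-bit elements would produce for the same input.  */
uint32_t
lanewise_lt0_16x2 (uint32_t x)
{
  uint32_t lo = (x & UINT32_C (0x00008000)) ? UINT32_C (0xffff) : 0;
  uint32_t hi = (x & UINT32_C (0x80000000)) ? UINT32_C (0xffff) : 0;
  return (hi << 16) | lo;
}

/* Self-check on a few lane combinations.  */
void
swar_self_check (void)
{
  assert (swar_lt0_16x2 (UINT32_C (0x80000001)) == UINT32_C (0xffff0000));
  assert (swar_lt0_16x2 (UINT32_C (0x7fffffff)) == UINT32_C (0x0000ffff));
  assert (swar_lt0_16x2 (UINT32_C (0xffffffff)) == UINT32_C (0xffffffff));
  assert (swar_lt0_16x2 (0) == 0);
}
```

Because the two functions agree on every input, a vectorized loop over the first form can be rewritten as a single signed compare on elements of half the width, which is what the cmlt in the optimized assembly does.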
> > > > >
> > > > > Signed-off-by: Manolis Tsamis 
> > > > >
> > > > > gcc/ChangeLog:
> > > > >
> > > > > > * match.pd: Simplify vector shift + bit_and + multiply in some
> > > > > > cases.
> > > > >
> > > > > gcc/testsuite/ChangeLog:
> > > > >
> > > > > * gcc.target/aarch64/swar_to_vec_cmp.c: New test.
> > > > >
> > > > > ---
> > > > >
> > > > > Changes in v2:
> > > > > - Changed pattern to use vec_cond_expr.
> > > > > - Changed pattern to work with VLA vector.
> > > > > - Added more checks and comments.
> > > > >
> > > > > >  gcc/match.pd  | 60 
> > > > > >  .../gcc.target/aarch64/swar_to_vec_cmp.c  | 72 +++
> > > > > >  2 files changed, 132 insertions(+)
> > > > > >  create mode 100644 gcc/testsuite/gcc.target/aarch64/swar_to_vec_cmp.c
> > > > > >
> > > > > > diff --git a/gcc/match.pd b/gcc/match.pd
> > > > > > index 67a0a682f31..05e7fc79ba8 100644
> > > > > --- a/gcc/match.pd
> > > > > +++ b/gcc/match.pd
> > > > > @@ -301,6 +301,66 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> > > > >  (view_convert (bit_and:itype (view_convert @0)
> > > > >  (ne @1 { build_zero_cst (type);
> > > > > })))
> > > > >
> > > > > > +/* In SWAR (SIMD in a register) code a signed comparison of packed data can
> > > > > > +   be constructed with a particular combination of shift, bitwise and,
> > > > > > +   and multiplication by constants.  If that code is vectorized we can
> > > > > > +   convert this pattern into a more efficient vector comparison.  */
> > > > > > +(simplify
> > > > > > + (mult (bit_and (rshift @0 uniform_integer_cst_p@1)
> > > > > > +   uniform_integer_cst_p@2)
> > > > > > +uniform_integer_cst_p@3)
> > > >
> > > > Please use VECTOR_CST in the match instead of
> > > > uniform_integer_cst_p and instead ...
> > > >
> > >
> > > Will do.
> > >
> > > > > > + (with {
> > > > > > +   tree rshift_cst = uniform_integer_cst_p (@1);
> > > > > > +   tree bit_and_cst = uniform_integer_cst_p (@2);
> > > > > > +   tree mult_cst = uniform_integer_cst_p (@3);
> > > > > > +  }
> > > > > > +  /* Make sure we're working with vectors and uniform vector
> > > > > > + constants.  */
> > > > > > +  (if (VECTOR_TYPE_P (type)
> > > >
> > > > ... test for non-NULL *_cst here where you can use
> > > > uniform_vector_p instead of uniform_integer_cst_p.  You can elide
> > > > the VECTOR_TYPE_P check then and instead do INTEGRAL_TYPE_P
> > > > (TREE_TYPE (type)).
> > > >
> > >
> > > Will do.
> > >
> > > > > +   && tree_fits_uhwi_p (rshift_cst)
> > > > > +   && tree_fits_uhwi_p (mult_cst)
> > > > > +   && tree_fits_uhwi_p

[PATCH (pushed)] fix Clang warning

2022-11-30 Thread Martin Liška

Fixes:
gcc/fortran/parse.cc:5782:32: warning: for loop has empty body [-Wempty-body]

gcc/fortran/ChangeLog:

* parse.cc (parse_omp_structured_block): Remove extra semicolon.
---
 gcc/fortran/parse.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/fortran/parse.cc b/gcc/fortran/parse.cc
index 51ff0fc6ace..cdae43fa1fd 100644
--- a/gcc/fortran/parse.cc
+++ b/gcc/fortran/parse.cc
@@ -5779,7 +5779,7 @@ parse_omp_structured_block (gfc_statement omp_st, bool workshare_stmts_only)
{
  gfc_omp_namelist *nl;
  for (nl = cp->ext.omp_clauses->lists[OMP_LIST_COPYPRIVATE];
- nl->next; nl = nl->next);
+ nl->next; nl = nl->next)
;
  nl->next = new_st.ext.omp_clauses->lists[OMP_LIST_COPYPRIVATE];
}
-- 
2.38.1
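
For context, the loop touched above is the usual "walk to the last node, then append" idiom, whose body is intentionally empty. A minimal stand-alone sketch follows; struct node and both functions are hypothetical stand-ins, not the gfortran data structures. Clang's -Wempty-body fires when the terminating semicolon sits on the same line as the for header and stays quiet when the semicolon is on a line of its own.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a singly linked list such as gfc_omp_namelist.  */
struct node
{
  int value;
  struct node *next;
};

/* Return the last node of a non-empty list.  */
struct node *
list_tail (struct node *head)
{
  struct node *nl;
  for (nl = head; nl->next; nl = nl->next)
    ;  /* Intentionally empty; the semicolon is on its own line.  */
  return nl;
}

/* Append REST after the last node of the non-empty list HEAD.  */
void
list_append (struct node *head, struct node *rest)
{
  list_tail (head)->next = rest;
}

/* Self-check: append a third node to a two-node list.  */
void
list_self_check (void)
{
  struct node c = { 3, NULL };
  struct node b = { 2, NULL };
  struct node a = { 1, &b };
  assert (list_tail (&a) == &b);
  list_append (&a, &c);
  assert (list_tail (&a) == &c);
}
```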

Re: [PATCH] 0/19 modula-2 front end patches overview

2022-11-30 Thread Richard Biener via Gcc-patches

On Fri, Nov 11, 2022 at 3:02 PM Richard Biener
 wrote:
>
> On Mon, Oct 10, 2022 at 5:32 PM Gaius Mulley via Gcc-patches
>  wrote:
> >
> >
> > Here are the latest modula-2 front end patches for review.
> > The status of the patches and their contents are also contained at:
> >
> >https://splendidisolation.ddns.net/public/modula2/patchsummary.html
> >
> > where they are also broken down into topic groups.
> >
> > In summary the high level changes from the last posting are:
> >
> >* the driver code has been completely rewritten and it is now based
> >  on the fortran driver and the c++ driver.  The gm2 driver adds
> >  paths/libraries depending upon dialect chosen.
> >* the linking mechanism has been completely redesigned
> >  (As per
> >  https://gcc.gnu.org/pipermail/gcc-patches/2022-May/595725.html).
> >  Objects can be linked via g++.  New linking options
> >  are available to allow linking with/without a scaffold.
> >* gcc/m2/Make-lang.in (rewritten).
> >* gm2tools/ removed and any required functionality with the
> >  new linking mechanism has been moved into cc1gm2.
> >
> > The gm2 testsuite has been extended to test project linking
> > options.
>
> Thanks for these improvements!
>
> The frontend specific parts are a lot to digest and I think it isn't
> too important to wait for the unlikely event that all of that gets a
> review.  I'm trusting you here as a maintainer and also based on the
> use of the frontend out in the wild.
> I've CCed the other two RMs for their opinion on this.

There's consensus on this from at least the majority of the RMs now.

> I hope to get to the driver parts that I reviewed the last time, I'd
> appreciate a look on the runtime library setup by somebody else.
>
> I think it's important to get this (and the rust frontend) into the tree
> before Christmas holidays so it gets exposed to the more weird treatment of
> some of our users (build wise).  This way we can develop either a negative or
> positive list of host/targets where to disable the new frontends.

So let's go ahead with the Modula-2 merge.  Gaius, can you post a final
series of the patches and perform the merging please?

Thus - OK to merge to trunk!

Thanks,
Richard.

> Thanks,
> Richard.
>
> >
> > Testing
> > ===
> >
> > 1. bootstrap on gcc-13 master --enable-languages=c,c++,fortran,d,lto
> >
> > 2. bootstrap on gcc-13 devel/modula-2 --enable-languages=c,c++,fortran,d,lto
> >no extra failures seen between contrib/compare_diffs 1 2
> >
> > 3. bootstrap on gcc-13 devel/modula-2 --enable-languages=c,c++,fortran,d,lto,m2
> >no extra non-m2 failures seen between contrib/compare_diffs 2 3
> >
> > Steps 1, 2, 3 were performed on amd64 and aarch64 systems.
> >
> > The devel/modula-2 branch has been bootstrapped on:
> >
> >amd64 (debian bullseye/suse leap, suse tumbleweed),
> >aarch64 (debian bullseye),
> >armv7l (raspian),
> >ppc64 (GNU/Linux),
> >ppc64le (GNU/Linux),
> >i586 (debian bullseye),
> >sparc64 solaris
> >sparc32 solaris
> >
> > and built on
> >
> >NetBSD 9.2 sparc64
> >OpenBSD amd64
> >
> > Sources
> > ===
> >
> > The patch set files follow in subsequent emails for review and copies
> > can be found in the tarball below.  For ease of testing the full front
> > end is also available via:
> >
> >   git clone git://gcc.gnu.org/git/gcc.git gcc-git-devel-modula2
> >   cd gcc-git-devel-modula2
> >   git checkout devel/modula-2
> >
> > The complete patch set is also available from:
> >
> >   https://splendidisolation.ddns.net/public/modula2/gm2patchset.tar.gz
> >
> > which can be applied to the gcc-13 master branch via:
> >
> >   git clone git://gcc.gnu.org/git/gcc.git gcc-git
> >   wget --no-check-certificate \
> >   https://splendidisolation.ddns.net/public/modula2/gm2patchset.tar.gz
> >   tar zxf gm2patchset.tar.gz
> >   bash gm2patchset/apply-patch.bash gcc-git
> >   bash gm2patchset/pre-configure.bash gcc-git # regenerates configure and friends
> >
> > when the script has completed the master branch should be identical
> > to git branch devel/modula-2 above modulo recent git master commits.
> >
> > Review Patch Set
> > 
> >
> > Here are all the source infrastructure files and all the c++/c sources
> > (minus the bootstrap tools as these are autogenerated from the
> > modula-2 sources).  I've not included the modula-2 sources (patch sets
> > 18 and 19) in these emails as an attempt to reduce the email volume.
> > They are available in
> > https://splendidisolation.ddns.net/public/modula2/gm2patchset.tar.gz
> > and of course the git repo.
> >
> > I'm happy to convert the documentation into Sphinx and at a convenient
> > point would like to post the analyser patches for modula2.
> >
> > Thank you for reviewing the patches and thank you to all the testers
> >
> > regards,
> > Gaius

RE: [PATCH v2] Add pattern to convert vector shift + bitwise and + multiply to vector compare in some cases.

2022-11-30 Thread Tamar Christina via Gcc-patches

> -Original Message-
> From: Richard Biener 
> Sent: Wednesday, November 30, 2022 1:19 PM
> To: Manolis Tsamis 
> Cc: gcc-patches@gcc.gnu.org; Philipp Tomsich ;
> Tamar Christina ;
> jiangning@amperecomputing.com; Christoph Muellner
> 
> Subject: Re: [PATCH v2] Add pattern to convert vector shift + bitwise and +
> multiply to vector compare in some cases.
> 
> On Wed, Nov 30, 2022 at 9:59 AM Manolis Tsamis 
> wrote:
> >
> > On Wed, Nov 30, 2022 at 9:44 AM Richard Biener
> >  wrote:
> > >
> > > On Tue, Nov 29, 2022 at 11:05 AM Manolis Tsamis
>  wrote:
> > > >
> > > > > When using SWAR (SIMD in a register) techniques a comparison
> > > > > operation within such a register can be made by using a
> > > > > combination of shifts, bitwise and and multiplication. If code
> > > > > using this scheme is vectorized then there is potential to replace
> > > > > all these operations with a single vector comparison, by reinterpreting
> > > > > the vector types to match the width of the SWAR register.
> > > > >
> > > > > For example, for the test function packed_cmp_16_32, the original
> > > > > generated code is:
> > > >
> > > > ldr q0, [x0]
> > > > add w1, w1, 1
> > > > > ushr v0.4s, v0.4s, 15
> > > > and v0.16b, v0.16b, v2.16b
> > > > shl v1.4s, v0.4s, 16
> > > > sub v0.4s, v1.4s, v0.4s
> > > > str q0, [x0], 16
> > > > cmp w2, w1
> > > > bhi .L20
> > > >
> > > > with this pattern the above can be optimized to:
> > > >
> > > > ldr q0, [x0]
> > > > add w1, w1, 1
> > > > > cmlt v0.8h, v0.8h, #0
> > > > str q0, [x0], 16
> > > > cmp w2, w1
> > > > bhi .L20
> > > >
> > > > The effect is similar for x86-64.
> > > >
> > > > Signed-off-by: Manolis Tsamis 
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > > * match.pd: Simplify vector shift + bit_and + multiply in some 
> > > > cases.
> > > >
> > > > gcc/testsuite/ChangeLog:
> > > >
> > > > * gcc.target/aarch64/swar_to_vec_cmp.c: New test.
> > > >
> > > > ---
> > > >
> > > > Changes in v2:
> > > > - Changed pattern to use vec_cond_expr.
> > > > - Changed pattern to work with VLA vector.
> > > > - Added more checks and comments.
> > > >
> > > > >  gcc/match.pd  | 60 
> > > > >  .../gcc.target/aarch64/swar_to_vec_cmp.c  | 72 +++
> > > > >  2 files changed, 132 insertions(+)
> > > > >  create mode 100644 gcc/testsuite/gcc.target/aarch64/swar_to_vec_cmp.c
> > > > >
> > > > > diff --git a/gcc/match.pd b/gcc/match.pd
> > > > > index 67a0a682f31..05e7fc79ba8 100644
> > > > --- a/gcc/match.pd
> > > > +++ b/gcc/match.pd
> > > > @@ -301,6 +301,66 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> > > >  (view_convert (bit_and:itype (view_convert @0)
> > > > >  (ne @1 { build_zero_cst (type); })))
> > > > >
> > > > > +/* In SWAR (SIMD in a register) code a signed comparison of packed data can
> > > > > +   be constructed with a particular combination of shift, bitwise and,
> > > > > +   and multiplication by constants.  If that code is vectorized we can
> > > > > +   convert this pattern into a more efficient vector comparison.  */
> > > > > +(simplify
> > > > > + (mult (bit_and (rshift @0 uniform_integer_cst_p@1)
> > > > > +   uniform_integer_cst_p@2)
> > > > > +uniform_integer_cst_p@3)
> > >
> > > Please use VECTOR_CST in the match instead of uniform_integer_cst_p
> > > and instead ...
> > >
> >
> > Will do.
> >
> > > > > + (with {
> > > > > +   tree rshift_cst = uniform_integer_cst_p (@1);
> > > > > +   tree bit_and_cst = uniform_integer_cst_p (@2);
> > > > > +   tree mult_cst = uniform_integer_cst_p (@3);
> > > > > +  }
> > > > > +  /* Make sure we're working with vectors and uniform vector constants.  */
> > > > > +  (if (VECTOR_TYPE_P (type)
> > >
> > > ... test for non-NULL *_cst here where you can use uniform_vector_p
> > > instead of uniform_integer_cst_p.  You can elide the VECTOR_TYPE_P
> > > check then and instead do INTEGRAL_TYPE_P (TREE_TYPE (type)).
> > >
> >
> > Will do.
> >
> > > > +   && tree_fits_uhwi_p (rshift_cst)
> > > > +   && tree_fits_uhwi_p (mult_cst)
> > > > +   && tree_fits_uhwi_p (bit_and_cst))
> > > > > +   /* Compute what constants would be needed for this to represent a packed
> > > > > +  comparison based on the shift amount denoted by RSHIFT_CST.  */
> > > > +   (with {
> > > > + HOST_WIDE_INT vec_elem_bits = vector_element_bits (type);
> > > > + poly_int64 vec_nelts = TYPE_VECTOR_SUBPARTS (type);
> > > > + poly_int64 vec_bits = vec_elem_bits * vec_nelts;
> > > > +
> > > > + unsigned HOST_WIDE_INT cmp_bits_i, bit_and_i, mult_i;
> > > > + unsigned HOST_WIDE_INT target_mult_i, target_bit_and_i;
> > > > + cmp_bits_i = tree_to_uhwi (rshift_cst) + 1;
> > > > + target_mult_i = (HOST_WIDE_INT_1U << cmp_bits_i) - 1;
> > > > +
> > > > + mult_i = tree_to_uhwi (mult_cst);
> > > >

Re: [PATCH v2] Add pattern to convert vector shift + bitwise and + multiply to vector compare in some cases.

2022-11-30 Thread Richard Biener via Gcc-patches

On Wed, Nov 30, 2022 at 9:59 AM Manolis Tsamis  wrote:
>
> On Wed, Nov 30, 2022 at 9:44 AM Richard Biener
>  wrote:
> >
> > On Tue, Nov 29, 2022 at 11:05 AM Manolis Tsamis  
> > wrote:
> > >
> > > When using SWAR (SIMD in a register) techniques a comparison operation 
> > > within
> > > such a register can be made by using a combination of shifts, bitwise and 
> > > and
> > > multiplication. If code using this scheme is vectorized then there is 
> > > potential
> > > to replace all these operations with a single vector comparison, by 
> > > reinterpreting
> > > the vector types to match the width of the SWAR register.
> > >
> > > For example, for the test function packed_cmp_16_32, the original 
> > > generated code is:
> > >
> > > ldr q0, [x0]
> > > add w1, w1, 1
> > > > ushr v0.4s, v0.4s, 15
> > > and v0.16b, v0.16b, v2.16b
> > > shl v1.4s, v0.4s, 16
> > > sub v0.4s, v1.4s, v0.4s
> > > str q0, [x0], 16
> > > cmp w2, w1
> > > bhi .L20
> > >
> > > with this pattern the above can be optimized to:
> > >
> > > ldr q0, [x0]
> > > add w1, w1, 1
> > > > cmlt v0.8h, v0.8h, #0
> > > str q0, [x0], 16
> > > cmp w2, w1
> > > bhi .L20
> > >
> > > The effect is similar for x86-64.
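
For readers less familiar with the SWAR idiom described above, here is a minimal sketch of a packed 16-bit "less than zero" test inside one 32-bit word. The name packed_lt0_16_32_word is purely illustrative and is not taken from the patch's test file; the constants are derived from the shift amount 15 shown in the assembly (a 16-bit lane, so a multiplier of 0xffff and a mask with the low bit of each lane set).

```cpp
#include <cstdint>

// Illustrative helper (hypothetical name): treat x as two packed 16-bit
// signed lanes and return 0xffff in every lane that is negative, 0 otherwise.
inline std::uint32_t packed_lt0_16_32_word(std::uint32_t x)
{
  // Move each lane's sign bit down to the lane's least significant bit,
  // keep only those bits, then multiply by 0xffff so that each surviving
  // 1 is smeared across its whole lane.
  return ((x >> 15) & 0x00010001u) * 0xffffu;
}
```

The multiplication by 0xffff is the same thing as the shl-by-16 followed by sub in the unoptimized assembly, since (y << 16) - y equals y * 0xffff.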
> > >
> > > Signed-off-by: Manolis Tsamis 
> > >
> > > gcc/ChangeLog:
> > >
> > > * match.pd: Simplify vector shift + bit_and + multiply in some 
> > > cases.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > > * gcc.target/aarch64/swar_to_vec_cmp.c: New test.
> > >
> > > ---
> > >
> > > Changes in v2:
> > > - Changed pattern to use vec_cond_expr.
> > > - Changed pattern to work with VLA vector.
> > > - Added more checks and comments.
> > >
> > >  gcc/match.pd  | 60 
> > >  .../gcc.target/aarch64/swar_to_vec_cmp.c  | 72 +++
> > >  2 files changed, 132 insertions(+)
> > >  create mode 100644 gcc/testsuite/gcc.target/aarch64/swar_to_vec_cmp.c
> > >
> > > diff --git a/gcc/match.pd b/gcc/match.pd
> > > index 67a0a682f31..05e7fc79ba8 100644
> > > --- a/gcc/match.pd
> > > +++ b/gcc/match.pd
> > > @@ -301,6 +301,66 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> > >  (view_convert (bit_and:itype (view_convert @0)
> > >  (ne @1 { build_zero_cst (type); })))
> > >
> > > +/* In SWAR (SIMD in a register) code a signed comparison of packed data 
> > > can
> > > +   be constructed with a particular combination of shift, bitwise and,
> > > +   and multiplication by constants.  If that code is vectorized we can
> > > +   convert this pattern into a more efficient vector comparison.  */
> > > +(simplify
> > > + (mult (bit_and (rshift @0 uniform_integer_cst_p@1)
> > > +   uniform_integer_cst_p@2)
> > > +uniform_integer_cst_p@3)
> >
> > Please use VECTOR_CST in the match instead of uniform_integer_cst_p
> > and instead ...
> >
>
> Will do.
>
> > > + (with {
> > > +   tree rshift_cst = uniform_integer_cst_p (@1);
> > > +   tree bit_and_cst = uniform_integer_cst_p (@2);
> > > +   tree mult_cst = uniform_integer_cst_p (@3);
> > > +  }
> > > +  /* Make sure we're working with vectors and uniform vector constants.  
> > > */
> > > +  (if (VECTOR_TYPE_P (type)
> >
> > ... test for non-NULL *_cst here where you can use uniform_vector_p instead
> > of uniform_integer_cst_p.  You can elide the VECTOR_TYPE_P check then
> > and instead do INTEGRAL_TYPE_P (TREE_TYPE (type)).
> >
>
> Will do.
>
> > > +   && tree_fits_uhwi_p (rshift_cst)
> > > +   && tree_fits_uhwi_p (mult_cst)
> > > +   && tree_fits_uhwi_p (bit_and_cst))
> > > +   /* Compute what constants would be needed for this to represent a 
> > > packed
> > > +  comparison based on the shift amount denoted by RSHIFT_CST.  */
> > > +   (with {
> > > + HOST_WIDE_INT vec_elem_bits = vector_element_bits (type);
> > > + poly_int64 vec_nelts = TYPE_VECTOR_SUBPARTS (type);
> > > + poly_int64 vec_bits = vec_elem_bits * vec_nelts;
> > > +
> > > + unsigned HOST_WIDE_INT cmp_bits_i, bit_and_i, mult_i;
> > > + unsigned HOST_WIDE_INT target_mult_i, target_bit_and_i;
> > > + cmp_bits_i = tree_to_uhwi (rshift_cst) + 1;
> > > + target_mult_i = (HOST_WIDE_INT_1U << cmp_bits_i) - 1;
> > > +
> > > + mult_i = tree_to_uhwi (mult_cst);
> > > + bit_and_i = tree_to_uhwi (bit_and_cst);
> > > + target_bit_and_i = 0;
> > > +
> > > + /* The bit pattern in BIT_AND_I should be a mask for the least
> > > +significant bit of each packed element that is CMP_BITS wide.  */
> > > + for (unsigned i = 0; i < vec_elem_bits / cmp_bits_i; i++)
> > > +   target_bit_and_i = (target_bit_and_i << cmp_bits_i) | 1U;
> > > +}
> > > +(if ((exact_log2 (cmp_bits_i)) >= 0
> > > +&& cmp_bits_i < HOST_BITS_PER_WIDE_INT
> > > +&& multiple_p (vec_bits, c
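
The constant computation in the quoted hunk can be tried in isolation. The following is a rough sketch, assuming plain 64-bit integers in place of HOST_WIDE_INT and tree constants; the struct and function names are hypothetical and exist only for this illustration.

```cpp
#include <cstdint>

// Hypothetical holder for the two constants the pattern expects to see.
struct swar_constants
{
  std::uint64_t mult;     // multiplier: all ones across one packed lane
  std::uint64_t bit_and;  // mask: lowest bit of every packed lane
};

// Given the right-shift amount and the width of one vector element, compute
// the multiplier and mask that would make shift + and + mult a packed
// "less than zero" comparison.  Assumes shift + 1 < 64, mirroring the
// cmp_bits_i < HOST_BITS_PER_WIDE_INT check in the patch.
inline swar_constants expected_swar_constants(unsigned shift, unsigned elem_bits)
{
  const unsigned cmp_bits = shift + 1;
  swar_constants c{};
  c.mult = (std::uint64_t{1} << cmp_bits) - 1;
  for (unsigned i = 0; i < elem_bits / cmp_bits; ++i)
    c.bit_and = (c.bit_and << cmp_bits) | 1u;
  return c;
}
```

For a shift of 15 on 32-bit elements this gives a multiplier of 0xffff and a mask of 0x00010001, which matches the 16-bit lanes in the packed_cmp_16_32 example.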

Re: [PATCH] libstdc++: Add error handler for <stacktrace>

2022-11-30 Thread Jonathan Wakely via Gcc-patches

On Wed, 30 Nov 2022 at 11:57, Jonathan Wakely  wrote:
>
>
>
> On Wed, 30 Nov 2022 at 11:54, Jonathan Wakely  wrote:
>>
>>
>>
>> On Wed, 30 Nov 2022 at 06:04, François Dumont via Libstdc++ 
>>  wrote:
>>>
>>> Good catch, then we also need this patch.
>>
>>
>> Is it worth printing an error? If we can't show the backtrace because of an 
>> error, we can just print nothing there.
>>
>> We also need to pass an error handler to the 
>> __glibcxx_backtrace_create_state call in formatter.h.
>>
>> Now that I look at this code again, why do we need the _M_backtrace_full 
>> member? It's always set to the same thing, why can't we just call that 
>> function directly?
>
>
> Oh right, I remember now ... because otherwise the libstdc++.so library needs 
> the definition of __glibcxx_backtrace_full.

I'm testing the attached patch.


>
>>
>> And I think we should use threaded=1 for the 
>> __glibcxx_backtrace_create_state call.
>>
>> So like the attached patch.
>>
>>
commit 6c9cc05dc097f6ee66f18731a6247cce36823d54
Author: Jonathan Wakely 
Date:   Wed Nov 30 12:32:53 2022

libstdc++: Pass error handler to libbacktrace functions

Also pass threaded=1 to __glibcxx_backtrace_create_state and remove some
of the namespace scope declarations.

libstdc++-v3/ChangeLog:

* include/debug/formatter.h [_GLIBCXX_DEBUG_BACKTRACE]
(_Error_formatter::_Error_formatter): Pass error handler to
__glibcxx_backtrace_create_state. Pass 1 for threaded argument.
(_Error_formatter::_S_err): Define empty function.
* src/c++11/debug.cc (_Error_formatter::_M_error): Pass error
handler to __glibcxx_backtrace_full.

diff --git a/libstdc++-v3/include/debug/formatter.h b/libstdc++-v3/include/debug/formatter.h
index f120163c6d4..e8a83a21bde 100644
--- a/libstdc++-v3/include/debug/formatter.h
+++ b/libstdc++-v3/include/debug/formatter.h
@@ -32,32 +32,17 @@
 #include 
 
 #if _GLIBCXX_HAVE_STACKTRACE
-struct __glibcxx_backtrace_state;
-
 extern "C"
 {
-  __glibcxx_backtrace_state*
+  struct __glibcxx_backtrace_state*
   __glibcxx_backtrace_create_state(const char*, int,
   void(*)(void*, const char*, int),
   void*);
-
-  typedef int (*__glibcxx_backtrace_full_callback) (
-void*, __UINTPTR_TYPE__, const char *, int, const char*);
-
-  typedef void (*__glibcxx_backtrace_error_callback) (
-void*, const char*, int);
-
-  typedef int (*__glibcxx_backtrace_full_func) (
-__glibcxx_backtrace_state*, int,
-__glibcxx_backtrace_full_callback,
-__glibcxx_backtrace_error_callback,
-void*);
-
   int
   __glibcxx_backtrace_full(
-__glibcxx_backtrace_state*, int,
-__glibcxx_backtrace_full_callback,
-__glibcxx_backtrace_error_callback,
+struct __glibcxx_backtrace_state*, int,
+int (*)(void*, __UINTPTR_TYPE__, const char *, int, const char*),
+void (*)(void*, const char*, int),
 void*);
 }
 #endif
@@ -609,10 +594,10 @@ namespace __gnu_debug
 , _M_function(__function)
 #if _GLIBCXX_HAVE_STACKTRACE
 # ifdef _GLIBCXX_DEBUG_BACKTRACE
-, _M_backtrace_state(__glibcxx_backtrace_create_state(0, 0, 0, 0))
+, _M_backtrace_state(__glibcxx_backtrace_create_state(0, 1, _S_err, 0))
 , _M_backtrace_full(&__glibcxx_backtrace_full)
 # else
-, _M_backtrace_state()
+, _M_backtrace_state(0)
 # endif
 #endif
 { }
@@ -631,8 +616,12 @@ namespace __gnu_debug
 const char*_M_text;
 const char*_M_function;
 #if _GLIBCXX_HAVE_STACKTRACE
-__glibcxx_backtrace_state* _M_backtrace_state;
-__glibcxx_backtrace_full_func  _M_backtrace_full;
+struct __glibcxx_backtrace_state*  _M_backtrace_state;
+// TODO: Remove _M_backtrace_full after __glibcxx_backtrace_full is moved
+// from libstdc++_libbacktrace.a to libstdc++.so:
+__decltype(&__glibcxx_backtrace_full)  _M_backtrace_full;
+
+static void _S_err(void*, const char*, int) { }
 #endif
 
   public:
diff --git a/libstdc++-v3/src/c++11/debug.cc b/libstdc++-v3/src/c++11/debug.cc
index 9eda38023f7..c08eaa7f921 100644
--- a/libstdc++-v3/src/c++11/debug.cc
+++ b/libstdc++-v3/src/c++11/debug.cc
@@ -1193,7 +1193,7 @@ namespace __gnu_debug
   {
print_literal(ctx, "Backtrace:\n");
_M_backtrace_full(
- _M_backtrace_state, 1, print_backtrace, nullptr, &ctx);
+ _M_backtrace_state, 1, print_backtrace, _S_err, &ctx);
ctx._M_first_line = true;
print_literal(ctx, "\n");
   }

Re: [PATCH] libstdc++: Add error handler for <stacktrace>

2022-11-30 Thread Jonathan Wakely via Gcc-patches

Resending with the typo in the mailing list address fixed...

On Wed, 30 Nov 2022 at 12:31, Jonathan Wakely  wrote:
>
> On Tue, 29 Nov 2022 at 21:41, Björn Schäpers wrote:
> >
> > From: Björn Schäpers 
> >
> > Not providing an error handler results in a null pointer dereference when
> > an error occurs.
>
>
> Thanks for the patch. This looks small enough to not require legal
> paperwork, but if you intend to make further contributions (and I hope
> you do!) please note the process at https://gcc.gnu.org/dco.html or
> complete the paperwork for a copyright assignment to the FSF,
> whichever you prefer.
>
> I'm going to test and commit the attached patch, which replaces your
> __backtrace_error_handler with a static member function so that we
> don't make the name visible in the global namespace.
>
> Thanks again!
commit 0e32de31f34a88b941acd2f471ba6e8e945372cf
Author: Björn Schäpers 
Date:   Wed Nov 30 12:04:16 2022

libstdc++: Add error handler for <stacktrace>

Not providing an error handler results in a null pointer dereference
when an error occurs.

Co-authored-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

* include/std/stacktrace (stacktrace_entry::_S_err_handler): New
static function.
(stacktrace_entry, basic_stacktrace): Pass &_S_err_handler to
all calls to libbacktrace.

diff --git a/libstdc++-v3/include/std/stacktrace b/libstdc++-v3/include/std/stacktrace
index e7cbbee5638..ec3335e89d8 100644
--- a/libstdc++-v3/include/std/stacktrace
+++ b/libstdc++-v3/include/std/stacktrace
@@ -155,11 +155,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
 template friend class basic_stacktrace;
 
+static void _S_err_handler(void*, const char*, int) { }
+
 static __glibcxx_backtrace_state*
 _S_init()
 {
   static __glibcxx_backtrace_state* __state
-   = __glibcxx_backtrace_create_state(nullptr, 1, nullptr, nullptr);
+   = __glibcxx_backtrace_create_state(nullptr, 1, _S_err_handler, nullptr);
   return __state;
 }
 
@@ -192,7 +194,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  return __function != nullptr;
   };
   const auto __state = _S_init();
-  if (::__glibcxx_backtrace_pcinfo(__state, _M_pc, +__cb, nullptr, &__data))
+  if (::__glibcxx_backtrace_pcinfo(__state, _M_pc, +__cb, _S_err_handler,
+  &__data))
return true;
   if (__desc && __desc->empty())
{
@@ -201,8 +204,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  if (__symname)
*static_cast<_Data*>(__data)->_M_desc = _S_demangle(__symname);
  };
- if (::__glibcxx_backtrace_syminfo(__state, _M_pc, +__cb2, nullptr,
-   &__data))
+ if (::__glibcxx_backtrace_syminfo(__state, _M_pc, +__cb2,
+   _S_err_handler, &__data))
return true;
}
   return false;
@@ -252,7 +255,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
if (auto __cb = __ret._M_prepare()) [[likely]]
  {
auto __state = stacktrace_entry::_S_init();
-   if (__glibcxx_backtrace_simple(__state, 1, __cb, nullptr,
+   if (__glibcxx_backtrace_simple(__state, 1, __cb,
+  stacktrace_entry::_S_err_handler,
   std::__addressof(__ret)))
  __ret._M_clear();
  }
@@ -270,7 +274,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
if (auto __cb = __ret._M_prepare()) [[likely]]
  {
auto __state = stacktrace_entry::_S_init();
-   if (__glibcxx_backtrace_simple(__state, __skip + 1, __cb, nullptr,
+   if (__glibcxx_backtrace_simple(__state, __skip + 1, __cb,
+  stacktrace_entry::_S_err_handler,
   std::__addressof(__ret)))
  __ret._M_clear();
  }
@@ -294,7 +299,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  {
auto __state = stacktrace_entry::_S_init();
int __err = __glibcxx_backtrace_simple(__state, __skip + 1, __cb,
-  nullptr,
+  stacktrace_entry::_S_err_handler,
   std::__addressof(__ret));
if (__err < 0)
  __ret._M_clear();
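
The shape of the fix can be shown without libbacktrace itself. The sketch below uses a made-up stand-in (fake_backtrace_op) for a C API that reports failures through a callback it calls unconditionally; it is not the real __glibcxx_backtrace_* interface. The no-op handler is a static member function, so no extra name is introduced at namespace scope, which is the reason given above for not using a free function.

```cpp
// Callback type in the style of a C error-reporting interface.
using error_callback = void (*)(void* data, const char* msg, int errnum);

// Hypothetical stand-in for a libbacktrace-style entry point.  On failure it
// calls error_cb without checking it, so passing a null callback would be a
// null pointer dereference.
inline int fake_backtrace_op(bool fail, error_callback error_cb, void* data)
{
  if (fail)
    {
      error_cb(data, "simulated failure", -1);
      return -1;
    }
  return 0;
}

struct entry
{
  // No-op handler: errors are deliberately ignored, the caller only looks
  // at the return value.
  static void err_handler(void*, const char*, int) { }

  // Returns true if the (simulated) operation succeeded.
  static bool lookup(bool fail)
  { return fake_backtrace_op(fail, err_handler, nullptr) == 0; }
};
```

With the handler in place, entry::lookup(true) simply returns false instead of crashing.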

Re: Java front-end and library patches.

2022-11-30 Thread Xi Ruoyao via Gcc-patches

On Wed, 2022-11-30 at 23:18 +1100, Zopolis0 via Gcc-patches wrote:
> 20-43 and 47 are just applying treewide changes that Java missed out
> on

Add something like "adapt Java frontend for r11-1234 change" then.  So
the reviewer can take a look at https://gcc.gnu.org/r11-1234 and review
the change more easily.

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University

[PATCH 1/7] OpenMP/OpenACC: Refine condition for when map clause expansion happens

2022-11-30 Thread Julian Brown

This patch fixes some cases for OpenACC and OpenMP where map clauses were
being expanded (adding firstprivate_pointer, attach/detach nodes, and so
forth) unnecessarily, after the "OpenMP/OpenACC: Rework clause expansion
and nested struct handling" patch (approved but not yet committed):

  https://gcc.gnu.org/pipermail/gcc-patches/2022-October/603792.html

This is done by introducing a C_ORT_ACC_TARGET region type for OpenACC
compute regions to help distinguish them from non-compute regions that
need different handling, and by passing the region type through to the
clause expansion functions.
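
As a rough illustration of why a combined region type is enough to tell compute regions apart, here is a sketch of the flag scheme. The enumerator names are shortened and hypothetical, and the values of the first two flags are an assumption (they are not visible in the hunk below); only the composition by bitwise OR mirrors the patch.

```cpp
// Sketch only: ORT_OMP and ORT_ACC values are assumed, the remaining
// enumerators follow the layout shown in the c-common.h hunk.
enum region_type : unsigned
{
  ORT_OMP          = 1 << 0,  // assumed value
  ORT_ACC          = 1 << 1,  // assumed value
  ORT_DECLARE_SIMD = 1 << 2,
  ORT_TARGET       = 1 << 3,
  ORT_OMP_TARGET   = ORT_OMP | ORT_TARGET,
  ORT_ACC_TARGET   = ORT_ACC | ORT_TARGET
};

// True for any OpenACC region, compute or not.
constexpr bool is_openacc(region_type ort)
{ return (ort & ORT_ACC) != 0; }

// True for any region that offloads, OpenMP or OpenACC.
constexpr bool is_target(region_type ort)
{ return (ort & ORT_TARGET) != 0; }

// True only for OpenACC compute regions, i.e. both bits set.
constexpr bool is_openacc_compute(region_type ort)
{ return (ort & ORT_ACC_TARGET) == ORT_ACC_TARGET; }
```

A function that used to take a "target" bool can take the region type instead and still recover that bool with is_target, while also being able to ask whether it is looking at OpenACC.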

The patch also fixes clause expansion for OpenMP TO/FROM clauses, which
need to dereference references but not have any additional mapping nodes.

(These cases showed up due to the gimplification changes in the C++
"declare mapper" patch, but logically belong next to the earlier patch
named above.)

2022-11-30  Julian Brown  

gcc/
* c-family/c-common.h (c_omp_region_type): Add C_ORT_ACC_TARGET.
(c_omp_address_inspector): Pass c_omp_region_type instead of "target"
bool.
* c-family/c-omp.cc (c_omp_address_inspector::expand_array_base):
Adjust clause expansion for OpenACC and non-map (OpenMP to/from)
clauses.
(c_omp_address_inspector::expand_component_selector): Use
c_omp_region_type parameter.  Don't expand OpenMP to/from clauses.
(c_omp_address_inspector::expand_map_clause): Take ORT parameter, pass
to expand_array_base, etc.

gcc/c/
* c-parser.cc (c_parser_oacc_all_clauses): Add TARGET parameter. Use
to select region type for c_finish_omp_clauses call.
(c_parser_oacc_loop): Update calls to c_parser_oacc_all_clauses.
(c_parser_oacc_compute): Likewise.
* c-typeck.cc (handle_omp_array_sections_1): Update for C_ORT_ACC_TARGET
addition and ai.expand_map_clause signature change.
(c_finish_omp_clauses): Likewise.

gcc/cp/
* parser.cc (cp_parser_oacc_all_clauses): Add TARGET parameter. Use
to select region type for finish_omp_clauses call.
(cp_parser_oacc_declare): Update call to cp_parser_oacc_all_clauses.
(cp_parser_oacc_loop): Update calls to cp_parser_oacc_all_clauses.
(cp_parser_oacc_compute): Likewise.
* pt.cc (tsubst_expr): Use C_ORT_ACC_TARGET for call to
tsubst_omp_clauses for compute regions.
* semantics.cc (handle_omp_array_sections_1): Update for
C_ORT_ACC_TARGET addition and ai.expand_map_clause signature change.
(finish_omp_clauses): Likewise.
---
 gcc/c-family/c-common.h | 10 +++--
 gcc/c-family/c-omp.cc   | 90 -
 gcc/c/c-parser.cc       | 15 ---
 gcc/c/c-typeck.cc       | 39 --
 gcc/cp/parser.cc        | 15 ---
 gcc/cp/pt.cc            |  4 +-
 gcc/cp/semantics.cc     | 47 ++---
 7 files changed, 144 insertions(+), 76 deletions(-)

diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index 14523fdefbc2..87e999becd5d 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -1221,7 +1221,8 @@ enum c_omp_region_type
   C_ORT_DECLARE_SIMD   = 1 << 2,
   C_ORT_TARGET = 1 << 3,
   C_ORT_OMP_DECLARE_SIMD   = C_ORT_OMP | C_ORT_DECLARE_SIMD,
-  C_ORT_OMP_TARGET = C_ORT_OMP | C_ORT_TARGET
+  C_ORT_OMP_TARGET = C_ORT_OMP | C_ORT_TARGET,
+  C_ORT_ACC_TARGET = C_ORT_ACC | C_ORT_TARGET
 };
 
 extern tree c_finish_omp_master (location_t, tree);
@@ -1321,10 +1322,11 @@ public:
   bool maybe_zero_length_array_section (tree);
 
   tree expand_array_base (tree, vec &, tree, unsigned *,
- bool, bool);
+ c_omp_region_type, bool);
   tree expand_component_selector (tree, vec &, tree,
- unsigned *, bool);
-  tree expand_map_clause (tree, tree, vec &, bool);
+ unsigned *, c_omp_region_type);
+  tree expand_map_clause (tree, tree, vec &,
+ c_omp_region_type);
 };
 
 enum c_omp_directive_kind {
diff --git a/gcc/c-family/c-omp.cc b/gcc/c-family/c-omp.cc
index 7498c883be80..aab4ad9bed32 100644
--- a/gcc/c-family/c-omp.cc
+++ b/gcc/c-family/c-omp.cc
@@ -3369,7 +3369,8 @@ tree
 c_omp_address_inspector::expand_array_base (tree c,
vec &addr_tokens,
tree expr, unsigned *idx,
-   bool target, bool decl_p)
+   c_omp_region_type ort,
+   bool decl_p)
 {
   using namespace omp_addr_tokenizer;
   location_t loc = OMP_CLAUSE_LOCATION (c);
@@ -3379,14 +3380,26 @@ c_omp_address_inspector::expand_array_base (tree c,
   && is_global_var (decl)
   && lookup_attribute ("omp declare target",

[PATCH 2/2] OpenMP: C++ "declare mapper" support

2022-11-30 Thread Julian Brown

This is a new version of the patch to support OpenMP 5.0 "declare mapper"
functionality for C++.  As with the previously-posted version, arrays
of structs whose elements would be mapped via a user-defined mapper
remain unsupported.
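
For context, this is roughly what user code relying on the feature looks like. It is a minimal sketch in the style of the OpenMP 5.0 examples, not a test from this patch; vec_t and sum_on_device are hypothetical names. Without OpenMP enabled the pragmas are ignored and the loop simply runs on the host.

```cpp
#include <cstddef>

struct vec_t
{
  std::size_t len;
  double* data;
};

// User-defined mapper: whenever a vec_t is mapped, also map the array
// section that its data member points to.
#pragma omp declare mapper(vec_t v) map(v, v.data[0:v.len])

inline double sum_on_device(vec_t& v)
{
  double sum = 0.0;
  // The map of v below triggers the mapper implicitly, so v.data[0:v.len]
  // is made available on the device as well.
  #pragma omp target map(tofrom: v) map(tofrom: sum)
  for (std::size_t i = 0; i < v.len; ++i)
    sum += v.data[i];
  return sum;
}
```

Note that the example maps a single struct; an array of vec_t mapped through the mapper is the case described above as still unsupported.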

(Previous versions were posted here:
  https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601560.html
  https://gcc.gnu.org/pipermail/gcc-patches/2022-March/591983.html)

This version of the patch uses a magic VAR_DECL instead of a magic
FUNCTION_DECL for representing mappers, which simplifies parsing
somewhat, and slightly reduces the number of places that need special-case
handling in the FE.  We use the DECL_INITIAL of the VAR_DECL to hold the
OMP_DECLARE_MAPPER definition.  To make types agree, we use the type of
the object to be mapped for both the var decl and the OMP_DECLARE_MAPPER
node itself.  Hence the OMP_DECLARE_MAPPER looks like a magic constant
of struct type in this particular case.

The magic var decl can go in all the places that the "declare mapper"
function decl went previously: at the top level of the program,
within a class definition (including template classes), and within a
function definition (including template functions).  In the class case
we conceptually use the C++-17-ism of defining the var decl "inline
static", equivalent to e.g.:

   [template ...]
   class bla {
 static inline omp declare mapper ... = #define omp declare mapper ..."
   };

(though of course we don't restrict the "declare mapper"-in-class syntax
to C++-17.)

The new representation necessitates some changes to template
instantiation.  In particular, declare mappers may trigger implicitly,
so we must make sure they are instantiated before they are needed (see
changes to mark_used, etc.).

I've rearranged the processing done by the gimplify_scan_omp_clauses and
gimplify_adjust_omp_clauses functions so the order of the phases can
remain intact in the presence of declared mappers.  To do this, most
gimplification of clauses in gimplify_scan_omp_clauses has been moved
to gimplify_adjust_omp_clauses.  This allows e.g. struct sibling-list
handling and topological clause sorting to work with the non-gimplified
form of clauses in the latter function -- including those that arise
from mapper expansion.  This seems to work well now.

Relative to the last-posted version, this patch brings forward various
refactoring that was previously done by the C and Fortran "declare mapper"
support patches -- aiming to reduce churn.  E.g. nested mapper finding
and mapper instantiation has been moved to c-family/c-omp.cc so it can
be shared between C and C++, and omp_name_type in omp-general.h (used
as the key to hash mapper definitions) is already templatized ready for
Fortran support.

This patch does not synthesize default mappers that map each of a struct's
elements individually: whole-struct mappings are still done by copying
the block of memory containing the struct.  That works fine apart from
cases where a struct has a member that is a reference (to a pointer).
We could fix that by synthesizing a mapper for such cases (only), but
that hasn't been attempted yet.  (I think that means Jakub's concerns
about blow-up of element mappings won't be a problem until that's done.)

New tests added in {gcc,libgomp}/c-c++-common have been restricted to
C++ for now, as the equivalent C parser changes to "declare mapper"
support are still TBD (as are any Fortran FE adjustments that might be
necessary due to various changes here and elsewhere).

This patch depends on the in-review or approved-but-pending-other-patches
patches:

  "OpenMP/OpenACC: Rework clause expansion and nested struct handling"
  https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601837.html
  (aka https://gcc.gnu.org/pipermail/gcc-patches/2022-October/603792.html)
  [approved, but breaks Fortran mapping a bit without...]

  "OpenMP: Pointers and member mappings"
  https://gcc.gnu.org/pipermail/gcc-patches/2022-October/603794.html
  [...this one, which is unreviewed]

  "OpenMP/OpenACC: Unordered/non-constant component offset runtime diagnostic"
  https://gcc.gnu.org/pipermail/gcc-patches/2022-October/603793.html
  [unreviewed also -- "QoI" improvement to above]

  "OpenMP: lvalue parsing for map/to/from clauses (C++)"
  https://gcc.gnu.org/pipermail/gcc-patches/2022-November/605367.html
  [reviewed, but needs a little revision]

2022-11-30  Julian Brown  

gcc/c-family/
* c-common.h (omp_mapper_list): Add forward declaration.
(c_omp_find_nested_mappers, c_omp_instantiate_mappers): Add prototypes.
* c-omp.cc (c_omp_find_nested_mappers): New function.
(remap_mapper_decl_info): New struct.
(remap_mapper_decl_1, omp_instantiate_mapper,
c_omp_instantiate_mappers): New functions.

gcc/cp/
* constexpr.cc (reduced_constant_expression_p): Add OMP_DECLARE_MAPPER
case.
(cxx_eval_constant_expression, potential_constant_expression_1):
Likewise.
* cp-g

[PATCH 0/2] C++ "declare mapper" support and map clause expansion fixes

2022-11-30 Thread Julian Brown

These two patches (posting as a "partial series" to avoid too much
duplication) comprise bug fixes for map clause expansion and a new version
of the patch to support OpenMP 5.0+ "declare mapper" directives for
C++. Hopefully previous review comments for the latter have been
adequately addressed.

These patches depend on various other patches that are not yet committed,
as described in the following emails.  (I may post the whole series
again after revising bits that already have review comments, if that'd
be helpful.)

Tested (with dependent patches) with offloading to NVPTX, and bootstrapped.

Julian Brown (~2):
  OpenMP/OpenACC: Refine condition for when map clause expansion happens
  [...]
  OpenMP: C++ "declare mapper" support

-- 
2.29.2

Re: Java front-end and library patches.

2022-11-30 Thread Zopolis0 via Gcc-patches

> * Each patch should have its own explanation of what it is doing and why,
> in the message body (not in an attachment).  Just the commit summary line
> and ChangeLog entries aren't enough, we need the actual substantive commit
> message explaining the patch.

The thing is, most of the patches do not need an explanation. Patches
1-13 are just re-adding code, 20-43 and 47 are just applying treewide
changes that Java missed out on, and patches 44-56 are either
incredibly simple or self-evident. If you feel like any of the listed
patches require an explanation, let me know and I will provide one,
but for now I don't see a reason to explain those.

However, patches 14-19 do need an explanation, as proven by multiple
reviews simply asking why I had made them. I'll send follow up
messages to those.

> Why is it now considered useful to add this front end back?

The way I see it, the Java front end was removed due to a lack of
maintenance and improvement. To put it simply, I am going to maintain
and improve it. That is the difference between now and then. There is
more nuance, but that is the gist of it.

> Which version is the basis of the one being added back...?

The exact same one that was removed from GCC, with the version taken
being the one right before it was removed.

> How has the series been validated?

I'm not exactly sure what you mean by this.

> Would you propose to maintain the front end and libraries in future?

I have big plans for the library, and plan to maintain that long into
the future. In regards to the actual front-end code, I will do what I
can to make sure it remains at its previous level of function, but
that is about it. I dislike working with the front end code, so I will
fix it, but I will not make sweeping changes to it.

>  Would you re-open any bugs against the front end or libraries that were 
> closed...as a result of it being removed from the tree...?

Good point, I hadn't thought of that. It makes sense to re-open them,
as they are by definition valid again, although I may have difficulty
with the frontend ones, as that is not my strong suit.




Just a brief overview of my plans for the frontend and library-- When
GCJ was first introduced it was "the free Java implementation". It was
trying to offer a bytecode compiler, a machine code compiler and a
runtime library. Clearly, this was too much, as it borrowed another
bytecode compiler and runtime library, and even then the runtime
library fell into disarray.

Now, we have many pieces of the puzzle. We have a bounty of free Java
bytecode compilers, and a free runtime library. The only thing missing
is a free machine code compiler, which GCJ was and is. I plan to
replace Classpath with the OpenJDK, and double down on the machine
code aspect of GCJ, dropping bytecode and interpreted support.

[PATCH] tree-optimization/107919 - predicate simplification in uninit

2022-11-30 Thread Richard Biener via Gcc-patches

The testcase from the PR at -O2 shows

((_277 == 2) AND (_79 == 0))
OR ((NOT (_277 == 0)) AND (NOT (_277 > 2)) AND (NOT (_277 == 2)) AND (_79 == 0))
OR ((NOT (pretmp_300 == 255)) AND (_277 == 0) AND (NOT (_277 > 2)) AND (NOT (_277 == 2)) AND (_79 == 0))

which we fail to simplify.  The following patch makes us simplify
the relations on _277, producing

((_79 == 0) AND (_277 == 2))
OR ((_79 == 0) AND (_277 <= 1) AND (NOT (_277 == 0)))
OR ((_79 == 0) AND (_277 == 0) AND (NOT (pretmp_300 == 255)))

which might be an incremental step to resolve a bogus uninit
diagnostic at -O2.  The patch uses maybe_fold_and_comparison for this.
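
The simplification of the relations on _277 is easy to sanity-check by brute force. The sketch below assumes _277 behaves like an unsigned integer (its type is not shown in the dump) and uses hypothetical function names; it only checks the _277 part of the second and third chains, leaving the _79 and pretmp_300 terms aside since they are carried over unchanged.

```cpp
// Second chain, relations on x (standing in for _277), before and after.
inline bool chain2_before(unsigned x)
{ return !(x == 0) && !(x > 2) && !(x == 2); }

inline bool chain2_after(unsigned x)
{ return x <= 1 && !(x == 0); }

// Third chain: x == 0 already implies the other two relations.
inline bool chain3_before(unsigned x)
{ return x == 0 && !(x > 2) && !(x == 2); }

inline bool chain3_after(unsigned x)
{ return x == 0; }

// Compare the two forms over a small range of values.
inline bool equivalent_for_small_values()
{
  for (unsigned x = 0; x < 256; ++x)
    if (chain2_before(x) != chain2_after(x)
        || chain3_before(x) != chain3_after(x))
      return false;
  return true;
}
```

Combining NOT (x > 2) with NOT (x == 2) into x <= 1 is the kind of pairwise folding of two comparisons against constants that the patch delegates to maybe_fold_and_comparisons.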

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR tree-optimization/107919
* gimple-predicate-analysis.cc (simplify_1): Rename to ...
(simplify_1a): .. this.
(simplify_1b): New.
(predicate::simplify): Call both simplify_1a and simplify_1b.
---
 gcc/gimple-predicate-analysis.cc | 83 +---
 1 file changed, 76 insertions(+), 7 deletions(-)

diff --git a/gcc/gimple-predicate-analysis.cc b/gcc/gimple-predicate-analysis.cc
index 23be4b69bab..ce2e1d10e43 100644
--- a/gcc/gimple-predicate-analysis.cc
+++ b/gcc/gimple-predicate-analysis.cc
@@ -42,6 +42,7 @@
 #include "value-query.h"
 #include "cfganal.h"
 #include "tree-eh.h"
+#include "gimple-fold.h"
 
 #include "gimple-predicate-analysis.h"
 
@@ -1174,7 +1175,9 @@ compute_control_dep_chain (basic_block dom_bb, const_basic_block dep_bb,
 
 /* Implemented simplifications:
 
-   1) ((x IOR y) != 0) AND (x != 0) is equivalent to (x != 0);
+   1a) ((x IOR y) != 0) AND (x != 0) is equivalent to (x != 0);
+   1b) [!](X rel y) AND [!](X rel y') where y == y' or both constant
+   can possibly be simplified
2) (X AND Y) OR (!X AND Y) is equivalent to Y;
3) X OR (!X AND Y) is equivalent to (X OR Y);
4) ((x IAND y) != 0) || (x != 0 AND y != 0)) is equivalent to
@@ -1184,11 +1187,11 @@ compute_control_dep_chain (basic_block dom_bb, 
const_basic_block dep_bb,
 
PREDS is the predicate chains, and N is the number of chains.  */
 
-/* Implement rule 1 above.  PREDS is the AND predicate to simplify
+/* Implement rule 1a above.  PREDS is the AND predicate to simplify
in place.  */
 
 static void
-simplify_1 (pred_chain &chain)
+simplify_1a (pred_chain &chain)
 {
   bool simplified = false;
   pred_chain s_chain = vNULL;
@@ -1245,6 +1248,66 @@ simplify_1 (pred_chain &chain)
   chain = s_chain;
 }
 
+/* Implement rule 1b above.  PREDS is the AND predicate to simplify
+   in place.  Returns true if CHAIN simplifies to true.  */
+
+static bool
+simplify_1b (pred_chain &chain)
+{
+  for (unsigned i = 0; i < chain.length (); i++)
+{
+  pred_info &a_pred = chain[i];
+
+  for (unsigned j = i + 1; j < chain.length (); ++j)
+   {
+ pred_info &b_pred = chain[j];
+
+ if (!operand_equal_p (a_pred.pred_lhs, b_pred.pred_lhs)
+ || (!operand_equal_p (a_pred.pred_rhs, b_pred.pred_rhs)
+ && !(CONSTANT_CLASS_P (a_pred.pred_rhs)
+  && CONSTANT_CLASS_P (b_pred.pred_rhs))))
+   continue;
+
+ tree_code a_code = a_pred.cond_code;
+ if (a_pred.invert)
+   a_code = invert_tree_comparison (a_code, false);
+ tree_code b_code = b_pred.cond_code;
+ if (b_pred.invert)
+   b_code = invert_tree_comparison (b_code, false);
+ /* Try to combine X a_code Y && X b_code Y'.  */
+ tree comb = maybe_fold_and_comparisons (boolean_type_node,
+ a_code,
+ a_pred.pred_lhs,
+ a_pred.pred_rhs,
+ b_code,
+ b_pred.pred_lhs,
+ b_pred.pred_rhs, NULL);
+ if (!comb)
+   ;
+ else if (integer_zerop (comb))
+   return true;
+ else if (integer_truep (comb))
+   {
+ chain.ordered_remove (j);
+ chain.ordered_remove (i);
+ i--;
+ break;
+   }
+ else if (COMPARISON_CLASS_P (comb)
+  && operand_equal_p (a_pred.pred_lhs, TREE_OPERAND (comb, 0)))
+   {
+ chain.ordered_remove (j);
+ a_pred.cond_code = TREE_CODE (comb);
+ a_pred.pred_rhs = TREE_OPERAND (comb, 1);
+ a_pred.invert = false;
+ j--;
+   }
+   }
+}
+
+  return false;
+}
+
 /* Implements rule 2 for the OR predicate PREDS:
 
2) (X AND Y) OR (!X AND Y) is equivalent to Y.  */
@@ -1435,11 +1498,17 @@ predicate::simplify (gimple *use_or_def, bool is_use)
   dump (dump_file, use_or_def, is_use ? "[USE]:\n" : "[DEF]:\n");
 }
 
-  unsigned n = m_preds.length ();
-  for (unsigned i = 0; i < n;

[PATCH] Improve uninit diagnostic dumps

2022-11-30 Thread Richard Biener via Gcc-patches

The following dumps the edge on which a use in a PHI is uninitialized.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

* tree-ssa-uninit.cc (find_uninit_use): Dump the edge for a
PHI node.
---
 gcc/tree-ssa-uninit.cc | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/tree-ssa-uninit.cc b/gcc/tree-ssa-uninit.cc
index bf2e50511af..7fbc384f2d4 100644
--- a/gcc/tree-ssa-uninit.cc
+++ b/gcc/tree-ssa-uninit.cc
@@ -1249,8 +1249,8 @@ find_uninit_use (gphi *phi, unsigned uninit_opnds, int 
*bb_to_rpo)
 
  if (dump_file && (dump_flags & TDF_DETAILS))
{
- fprintf (dump_file, "Found unguarded use in bb %u: ",
-  use_bb->index);
+ fprintf (dump_file, "Found unguarded use on edge %u -> %u: ",
+  e->src->index, e->dest->index);
  print_gimple_stmt (dump_file, use_stmt, 0);
}
  /* Found a phi use that is not guarded, mark the phi_result as
-- 
2.35.3

[PATCH] tree-optimization/107919 - uninit diagnostic predicate simplification

2022-11-30 Thread Richard Biener via Gcc-patches

We fail to simplify

((_145 != 0B) AND (_531 == 2) AND (_109 == 0))
OR ((NOT (_145 != 0B)) AND (_531 == 2) AND (_109 == 0))
OR ((NOT (_531 == 2)) AND (_109 == 0))

because the existing simplification of !A && B || A && B is implemented
too simplistically.  The following re-implements it, which fixes the
bogus uninit diagnostic at -O1 but not yet at -O2.
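
The idea of the re-implementation can be modelled outside of GCC.  The
following is a rough sketch under stated assumptions, not the code from
the patch: Literal, Chain and Predicate are hypothetical stand-ins for
pred_info, pred_chain and the union of chains, and simplify_fully is a
made-up helper that repeats the pass until nothing changes.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins: a literal is a variable id plus a negation flag,
// a chain is an AND of literals, a predicate is an OR of chains.
struct Literal { int var; bool negated; };
using Chain = std::vector<Literal>;
using Predicate = std::vector<Chain>;

static bool same(const Literal &a, const Literal &b) {
  return a.var == b.var && a.negated == b.negated;
}

static bool opposite(const Literal &a, const Literal &b) {
  return a.var == b.var && a.negated != b.negated;
}

// One pass of "(X AND Y) OR (!X AND Y) is equivalent to Y" for chains of
// arbitrary but equal length.  Returns true if anything changed.
bool simplify_2(Predicate &preds) {
  const std::size_t none = static_cast<std::size_t>(-1);
  bool simplified = false;
  for (std::size_t i = 0; i < preds.size(); ++i)
    for (std::size_t j = i + 1; j < preds.size(); ++j) {
      Chain &a = preds[i];
      const Chain &b = preds[j];
      if (a.size() != b.size())
        continue;

      // Look for exactly one position where the chains are negations of
      // each other; every other position must be equal.
      std::size_t neg_idx = none;
      for (std::size_t k = 0; k < a.size(); ++k) {
        if (same(a[k], b[k]))
          continue;
        if (neg_idx != none || !opposite(a[k], b[k])) {
          neg_idx = none;
          break;
        }
        neg_idx = k;
      }
      if (neg_idx == none)
        continue;

      // Drop the negated literal from chain i and remove chain j.
      a.erase(a.begin() + static_cast<std::ptrdiff_t>(neg_idx));
      preds.erase(preds.begin() + static_cast<std::ptrdiff_t>(j));
      simplified = true;
      if (a.empty())
        preds.clear();  // X OR !X: this model uses "empty" for "always true"
      break;
    }
  return simplified;
}

// Repeat the pass until a fixed point is reached.
Predicate simplify_fully(Predicate preds) {
  while (simplify_2(preds))
    ;
  return preds;
}
```

With A, B and C numbered 0, 1 and 2, the predicate from the mail,
(A AND B AND C) OR (!A AND B AND C) OR (!B AND C), first loses A and
then B, leaving just C, which corresponds to (_109 == 0).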

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR tree-optimization/107919
* gimple-predicate-analysis.cc (predicate::simplify_2):
Handle predicates of arbitrary length.

* g++.dg/warn/Wuninitialized-pr107919-1.C: New testcase.
---
 gcc/gimple-predicate-analysis.cc  | 71 ---
 .../g++.dg/warn/Wuninitialized-pr107919-1.C   | 15 
 2 files changed, 43 insertions(+), 43 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/warn/Wuninitialized-pr107919-1.C

diff --git a/gcc/gimple-predicate-analysis.cc b/gcc/gimple-predicate-analysis.cc
index 5013a4447d6..23be4b69bab 100644
--- a/gcc/gimple-predicate-analysis.cc
+++ b/gcc/gimple-predicate-analysis.cc
@@ -1257,64 +1257,49 @@ predicate::simplify_2 ()
   /* (X AND Y) OR (!X AND Y) is equivalent to Y.
  (X AND Y) OR (X AND !Y) is equivalent to X.  */
 
-  unsigned n = m_preds.length ();
-  for (unsigned i = 0; i < n; i++)
+  for (unsigned i = 0; i < m_preds.length (); i++)
 {
   pred_chain &a_chain = m_preds[i];
-  if (a_chain.length () != 2)
-   continue;
-
-  /* Create copies since the chain may be released below before
-the copy is added to the other chain.  */
-  const pred_info x = a_chain[0];
-  const pred_info y = a_chain[1];
 
-  for (unsigned j = 0; j < n; j++)
+  for (unsigned j = i + 1; j < m_preds.length (); j++)
{
- if (j == i)
-   continue;
-
  pred_chain &b_chain = m_preds[j];
- if (b_chain.length () != 2)
+ if (b_chain.length () != a_chain.length ())
continue;
 
- const pred_info &x2 = b_chain[0];
- const pred_info &y2 = b_chain[1];
-
- if (pred_equal_p (x, x2) && pred_neg_p (y, y2))
+ unsigned neg_idx = -1U;
+ for (unsigned k = 0; k < a_chain.length (); ++k)
{
- /* Kill a_chain.  */
- b_chain.release ();
- a_chain.release ();
- b_chain.safe_push (x);
- simplified = true;
- break;
+ if (pred_equal_p (a_chain[k], b_chain[k]))
+   continue;
+ if (neg_idx != -1U)
+   {
+ neg_idx = -1U;
+ break;
+   }
+ if (pred_neg_p (a_chain[k], b_chain[k]))
+   neg_idx = k;
+ else
+   break;
}
- if (pred_neg_p (x, x2) && pred_equal_p (y, y2))
+ /* If we found equal chains with one negated predicate
+simplify.  */
+ if (neg_idx != -1U)
{
- /* Kill a_chain.  */
- a_chain.release ();
- b_chain.release ();
- b_chain.safe_push (y);
+ a_chain.ordered_remove (neg_idx);
+ m_preds.ordered_remove (j);
  simplified = true;
+ if (a_chain.is_empty ())
+   {
+ /* A || !A simplifies to true, wipe the whole predicate.  */
+ for (unsigned k = 0; k < m_preds.length (); ++k)
+   m_preds[k].release ();
+ m_preds.truncate (0);
+   }
  break;
}
}
 }
-  /* Now clean up the chain.  */
-  if (simplified)
-{
-  pred_chain_union s_preds = vNULL;
-  for (unsigned i = 0; i < n; i++)
-   {
- if (m_preds[i].is_empty ())
-   continue;
- s_preds.safe_push (m_preds[i]);
-   }
-  m_preds.release ();
-  m_preds = s_preds;
-  s_preds = vNULL;
-}
 
   return simplified;
 }
diff --git a/gcc/testsuite/g++.dg/warn/Wuninitialized-pr107919-1.C 
b/gcc/testsuite/g++.dg/warn/Wuninitialized-pr107919-1.C
new file mode 100644
index 000..dd631dc8bfe
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wuninitialized-pr107919-1.C
@@ -0,0 +1,15 @@
+// { dg-do compile }
+// { dg-require-effective-target c++17 }
+// { dg-options "-O -Wuninitialized" }
+
+#include 
+#include 
+
+using Event = std::variant>>, 
int, char>;
+
+void do_something(void* storage)
+{
+  Event event {};
+  auto& swappedValue = *reinterpret_cast(storage);
+  std::swap(event, swappedValue);
+}
-- 
2.35.3

Re: [Patch] libgomp.texi: List GCN's 'gfx803' under OpenMP Context Selectors (was: amdgcn: Support AMD-specific 'isa' traits in OpenMP context selectors)

2022-11-30 Thread Tobias Burnus




On 30.11.22 10:43, Andrew Stubbs wrote:

On 29/11/2022 18:26, Tobias Burnus wrote:

On 29.11.22 16:56, Paul-Antoine Arras wrote:

This patch adds support for 'gfx803' as an alias for 'fiji' in OpenMP
context selectors, [...]

PA committed that patch as
https://gcc.gnu.org/r13-4403-g1fd508744eccda9ad9c6d6fcce5b2ea9c568818d
(thanks!)

I think this should be documented somewhere. We have
https://gcc.gnu.org/onlinedocs/libgomp/OpenMP-Context-Selectors.html

The wording is a little odd.
How about "Additionally, gfx803 is supported as an alias for fiji"?


Committed with the suggested wording:
https://gcc.gnu.org/r13-4404-ge0b95c2e8b771b53876321a6a0a9497619af73cd

Thanks,

Tobias

PS: It does not help with finding a good wording if that's the last task
before calling it a day...


Re: [PATCH] tree-chrec: Fix up ICE on pointer multiplication [PR107835]

2022-11-30 Thread Richard Biener via Gcc-patches

On Wed, 30 Nov 2022, Jakub Jelinek wrote:

> Hi!
> 
> r13-254-gdd3c7873a61019e9 added an optimization for {a, +, a} (x-1),
> but as can be seen on the following testcase, the way it is written
> where chrec_fold_multiply is called with type doesn't work for pointers:
>  res = build_int_cst (TREE_TYPE (x), 1);
>  res = chrec_fold_plus (TREE_TYPE (x), x, res);
>  res = chrec_convert_rhs (type, res, NULL);
>  res = chrec_fold_multiply (type, chrecr, res);
> while what we were doing before and what is still used if the condition
> doesn't match is fine:
>  res = chrec_convert_rhs (TREE_TYPE (chrecr), x, NULL);
>  res = chrec_fold_multiply (TREE_TYPE (chrecr), chrecr, res);
>  res = chrec_fold_plus (type, CHREC_LEFT (chrec), res);
> because it performs chrec_fold_multiply on TREE_TYPE (chrecr) and converts
> only afterwards.
> 
> I think the easiest fix is to ignore the new path for pointer types.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Thanks,
Richard.

> 2022-11-30  Jakub Jelinek  
> 
>   PR tree-optimization/107835
>   * tree-chrec.cc (chrec_apply): Don't handle "{a, +, a} (x-1)"
>   as "a*x" if type is a pointer type.
> 
>   * gcc.c-torture/compile/pr107835.c: New test.
> 
> --- gcc/tree-chrec.cc.jj  2022-05-10 18:33:14.641029951 +0200
> +++ gcc/tree-chrec.cc 2022-11-29 15:24:41.810400368 +0100
> @@ -622,7 +622,8 @@ chrec_apply (unsigned var,
> /* "{a, +, b} (x)"  ->  "a + b*x".  */
> else if (operand_equal_p (CHREC_LEFT (chrec), chrecr)
>  && TREE_CODE (x) == PLUS_EXPR
> -&& integer_all_onesp (TREE_OPERAND (x, 1)))
> +&& integer_all_onesp (TREE_OPERAND (x, 1))
> +&& !POINTER_TYPE_P (type))
>   {
> /* We know the number of iterations can't be negative.
>So {a, +, a} (x-1) -> "a*x".  */
> --- gcc/testsuite/gcc.c-torture/compile/pr107835.c.jj 2022-11-29 
> 15:31:32.565382590 +0100
> +++ gcc/testsuite/gcc.c-torture/compile/pr107835.c2022-11-29 
> 15:31:15.795628304 +0100
> @@ -0,0 +1,11 @@
> +/* PR tree-optimization/107835 */
> +
> +int *
> +foo (void)
> +{
> +  int *x = 0;
> +  unsigned n = n;
> +  for (; n; --n, ++x)
> +;
> +  return x;
> +}
> 
>   Jakub
> 
> 


Re: [PATCH] range-op: Implement floating point division fold_range [PR107569]

2022-11-30 Thread Iain Buclaw via Gcc-patches

Excerpts from Jakub Jelinek via Gcc-patches's message of November 11, 2022 
10:09 am:
> Hi!
> 
> Here is the floating point division fold_range implementation,
> as I wrote in the last mail, we could outline some of the common parts
> into static methods with descriptive names and share them between
> foperator_div and foperator_mult.
> 
> Bootstrapped/regtested on top of the earlier version of the multiplication
> fold_range on x86_64-linux and i686-linux, regressions are
> +FAIL: gcc.dg/pr95115.c execution test
> +FAIL: libphobos.phobos/std/math/hardware.d execution test
> +FAIL: libphobos.phobos_shared/std/math/hardware.d execution test

I've had some time to look at the Phobos failures, and it seems to me
that it's a poorly written test.

pragma(inline, false) static void blockopt(ref real x) {}
real a = 3.5;
// Set all the flags to zero
resetIeeeFlags();
assert(!ieeeFlags.divByZero);
blockopt(a); // avoid constant propagation by the optimizer
// Perform a division by zero.
a /= 0.0L;
assert(a == real.infinity);
assert(ieeeFlags.divByZero);
blockopt(a); // avoid constant propagation by the optimizer


1. Since this patch, that `a /= 0.0L` operation no longer appears in the
final assembly - so no divide-by-zero flags are raised.

2. Whoever introduced blockopt() perhaps did not understand that
`a /= 0.0L` is not safe from constant propagation just because it is
surrounded by some uninlinable call.

I'll fix the test upstream; it should really be something like:

pragma(inline, false)
static real forceDiv(real x, real y) { return x / y; }
a = forceDiv(a, 0.0L);
assert(a == real.infinity);
assert(ieeeFlags.divByZero);
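
For readers more familiar with C++ than D, roughly the same idea can be
sketched as below.  This is an illustration under assumptions, not the
Phobos fix: force_div and div_by_zero_sets_flag are made-up names, and
volatile objects are used here in place of pragma(inline, false) to keep
the division from being folded or deleted.

```cpp
#include <cfenv>
#include <cmath>

// Divide at run time.  Reading the operands through volatile objects keeps
// the compiler from folding the division or removing it as dead code.
long double force_div(long double x, long double y) {
  volatile long double vx = x;
  volatile long double vy = y;
  return vx / vy;
}

// Returns true if x / 0 yields an infinity and raises FE_DIVBYZERO.
// Intended for finite, non-zero x.
bool div_by_zero_sets_flag(long double x) {
  std::feclearexcept(FE_ALL_EXCEPT);
  volatile long double q = force_div(x, 0.0L);
  const long double result = q;
  return std::isinf(result) && std::fetestexcept(FE_DIVBYZERO) != 0;
}
```

The point is the same as in the D version: the division has to happen
somewhere the optimizer cannot see through, otherwise nothing raises the
flag that the test then asserts on.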


Regards,
Iain.

[PATCH] c-family: Account for integral promotions of left shifts for -Wshift-overflow warning [PR107846]

2022-11-30 Thread Jakub Jelinek via Gcc-patches

Hi!

The r13-1100-gacb1e6f43dc2bbedd124 change added match.pd narrowing
of left shifts, and while I believe the C++ FE calls the warning on unfolded
trees, the C FE folds them and so left shifts where integral promotion
happened and so were done in int type will be usually narrowed back to
char/signed char/unsigned char/short/unsigned short left shifts if the
shift count is constant and fits into the precision of the var being
shifted.
One possibility would be to restrict the match.pd optimization to GIMPLE
only; another would be not to fold in the C FE before this warning (well, we
would still need to fold the shift count operand to a constant if possible).
The following patch just takes integral promotion into account in the
warning code.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk,
or do you prefer some other way to resolve this?
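
To illustrate why the promoted type matters, here is a small sketch.  It
is a simplified model and not the code in c-warn.cc: the helper names are
invented, the sign-bit special cases of the real warning are ignored, and
it assumes a target where int is wider than unsigned short.

```cpp
#include <cstdint>
#include <type_traits>

// Integral promotion: the left operand is promoted before the shift, so on
// targets where int is wider than unsigned short the shift is done in int.
static_assert(
    std::is_same<decltype(static_cast<unsigned short>(1) << 8), int>::value,
    "unsigned short is promoted to int before the shift");

// Number of bits needed to represent v as an unsigned value.
constexpr unsigned min_precision_unsigned(std::uint64_t v) {
  unsigned bits = 0;
  for (; v != 0; v >>= 1)
    ++bits;
  return bits;
}

// Simplified check: does v << count need more than prec bits?
constexpr bool shift_needs_more_bits(std::uint64_t v, unsigned count,
                                     unsigned prec) {
  return min_precision_unsigned(v) + count > prec;
}
```

For the value 8000 from the new test (13 bits) shifted left by 8, the
result needs 21 bits.  Measured against a narrowed 16-bit type that looks
like an overflow, while measured against the promoted 32-bit int it does
not.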

2022-11-30  Jakub Jelinek  

PR c/107846
* c-warn.cc: Include langhooks.h.
(maybe_warn_shift_overflow): Set type0 to what TREE_TYPE (op0)
promotes to rather than TREE_TYPE (op0) itself, if TREE_TYPE (op0)
is narrower than type0 and unsigned, use wi::min_precision with
UNSIGNED and fold_convert op0 to type0 before emitting the warning.

* gcc.dg/pr107846.c: New test.

--- gcc/c-family/c-warn.cc.jj   2022-11-23 11:50:06.0 +0100
+++ gcc/c-family/c-warn.cc  2022-11-29 16:13:15.140713040 +0100
@@ -39,6 +39,7 @@ along with GCC; see the file COPYING3.
 #include "calls.h"
 #include "stor-layout.h"
 #include "tree-pretty-print.h"
+#include "langhooks.h"
 
 /* Print a warning if a constant expression had overflow in folding.
Invoke this function on every expression that the language
@@ -2615,14 +2616,19 @@ maybe_warn_shift_overflow (location_t lo
   || TREE_CODE (op1) != INTEGER_CST)
 return false;
 
-  tree type0 = TREE_TYPE (op0);
+  /* match.pd could have narrowed the left shift already,
+ take type promotion into account.  */
+  tree type0 = lang_hooks.types.type_promotes_to (TREE_TYPE (op0));
   unsigned int prec0 = TYPE_PRECISION (type0);
 
   /* Left-hand operand must be signed.  */
   if (TYPE_OVERFLOW_WRAPS (type0) || cxx_dialect >= cxx20)
 return false;
 
-  unsigned int min_prec = (wi::min_precision (wi::to_wide (op0), SIGNED)
+  signop sign = SIGNED;
+  if (TYPE_PRECISION (TREE_TYPE (op0)) < TYPE_PRECISION (type0))
+sign = TYPE_SIGN (TREE_TYPE (op0));
+  unsigned int min_prec = (wi::min_precision (wi::to_wide (op0), sign)
   + TREE_INT_CST_LOW (op1));
   /* Handle the case of left-shifting 1 into the sign bit.
* However, shifting 1 _out_ of the sign bit, as in
@@ -2645,7 +2651,8 @@ maybe_warn_shift_overflow (location_t lo
 warning_at (loc, OPT_Wshift_overflow_,
"result of %qE requires %u bits to represent, "
"but %qT only has %u bits",
-   build2_loc (loc, LSHIFT_EXPR, type0, op0, op1),
+   build2_loc (loc, LSHIFT_EXPR, type0,
+   fold_convert (type0, op0), op1),
min_prec, type0, prec0);
 
   return overflowed;
--- gcc/testsuite/gcc.dg/pr107846.c.jj  2022-11-29 16:18:34.427033919 +0100
+++ gcc/testsuite/gcc.dg/pr107846.c 2022-11-29 16:16:32.464821272 +0100
@@ -0,0 +1,14 @@
+/* PR c/107846 */
+/* { dg-do compile } */
+/* { dg-options "-Wall" } */
+
+#define foo(x, b, n, m) ((unsigned short) (x) << (b - (n + 1) * 8) >> (b - 8) << (m * 8))
+#define bar(x) ((unsigned short) (foo (x, 16, 0, 1) | foo (x, 16, 1, 0)))
+#define baz(x) bar (x)
+static const int v = 8000;
+
+unsigned short
+qux (int t)
+{
+  return t != baz (v);
+}

Jakub

[PATCH] tree-chrec: Fix up ICE on pointer multiplication [PR107835]

2022-11-30 Thread Jakub Jelinek via Gcc-patches

Hi!

r13-254-gdd3c7873a61019e9 added an optimization for {a, +, a} (x-1),
but as can be seen on the following testcase, the way it is written
where chrec_fold_multiply is called with type doesn't work for pointers:
 res = build_int_cst (TREE_TYPE (x), 1);
 res = chrec_fold_plus (TREE_TYPE (x), x, res);
 res = chrec_convert_rhs (type, res, NULL);
 res = chrec_fold_multiply (type, chrecr, res);
while what we were doing before and what is still used if the condition
doesn't match is fine:
 res = chrec_convert_rhs (TREE_TYPE (chrecr), x, NULL);
 res = chrec_fold_multiply (TREE_TYPE (chrecr), chrecr, res);
 res = chrec_fold_plus (type, CHREC_LEFT (chrec), res);
because it performs chrec_fold_multiply on TREE_TYPE (chrecr) and converts
only afterwards.

I think the easiest fix is to ignore the new path for pointer types.
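
As a side note on the arithmetic behind the shortcut: an affine chrec
{a, +, b} evaluated at x is a + b*x, so {a, +, a} evaluated at x-1 is
a + a*(x-1) = a*x.  A minimal sketch with plain integers follows; the
function names are made up here and are not the tree-chrec.cc API.

```cpp
#include <cstdint>

// Value of the affine chrec {left, +, step} at iteration x: left + step * x.
constexpr std::int64_t chrec_apply_affine(std::int64_t left, std::int64_t step,
                                          std::int64_t x) {
  return left + step * x;
}

// Shortcut for the special case {a, +, a} applied to (x - 1).
constexpr std::int64_t chrec_apply_shortcut(std::int64_t a, std::int64_t x) {
  return a * x;
}

// Check that the general form and the shortcut agree for x = 1 .. max_x.
constexpr bool shortcut_matches(std::int64_t a, std::int64_t max_x) {
  for (std::int64_t x = 1; x <= max_x; ++x)
    if (chrec_apply_affine(a, a, x - 1) != chrec_apply_shortcut(a, x))
      return false;
  return true;
}
```

The identity holds for integers, but the shortcut needs a multiplication
in the result type, and multiplying in a pointer type is not meaningful,
which is why the older path that multiplies in TREE_TYPE (chrecr) and
adds afterwards is kept for pointers.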

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2022-11-30  Jakub Jelinek  

PR tree-optimization/107835
* tree-chrec.cc (chrec_apply): Don't handle "{a, +, a} (x-1)"
as "a*x" if type is a pointer type.

* gcc.c-torture/compile/pr107835.c: New test.

--- gcc/tree-chrec.cc.jj2022-05-10 18:33:14.641029951 +0200
+++ gcc/tree-chrec.cc   2022-11-29 15:24:41.810400368 +0100
@@ -622,7 +622,8 @@ chrec_apply (unsigned var,
  /* "{a, +, b} (x)"  ->  "a + b*x".  */
  else if (operand_equal_p (CHREC_LEFT (chrec), chrecr)
   && TREE_CODE (x) == PLUS_EXPR
-  && integer_all_onesp (TREE_OPERAND (x, 1)))
+  && integer_all_onesp (TREE_OPERAND (x, 1))
+  && !POINTER_TYPE_P (type))
{
  /* We know the number of iterations can't be negative.
 So {a, +, a} (x-1) -> "a*x".  */
--- gcc/testsuite/gcc.c-torture/compile/pr107835.c.jj   2022-11-29 
15:31:32.565382590 +0100
+++ gcc/testsuite/gcc.c-torture/compile/pr107835.c  2022-11-29 
15:31:15.795628304 +0100
@@ -0,0 +1,11 @@
+/* PR tree-optimization/107835 */
+
+int *
+foo (void)
+{
+  int *x = 0;
+  unsigned n = n;
+  for (; n; --n, ++x)
+;
+  return x;
+}

Jakub

Re: [Patch] libgomp.texi: List GCN's 'gfx803' under OpenMP Context Selectors (was: amdgcn: Support AMD-specific 'isa' traits in OpenMP context selectors)

2022-11-30 Thread Andrew Stubbs


On 29/11/2022 18:26, Tobias Burnus wrote:

Hi PA, hi Andrew, hi Jakub, hi all,

On 29.11.22 16:56, Paul-Antoine Arras wrote:

This patch adds support for 'gfx803' as an alias for 'fiji' in OpenMP
context selectors, [...]


I think this should be documented somewhere. We have
https://gcc.gnu.org/onlinedocs/libgomp/OpenMP-Context-Selectors.html

For GCN and ISA, it refers to -march= and gfx803 is only a context
selector. Hence:

How about the attached patch?


The wording is a little odd.

How about "Additionally, gfx803 is supported as an alias for fiji"?

Andrew

Re: [PATCH v2] Add pattern to convert vector shift + bitwise and + multiply to vector compare in some cases.

2022-11-30 Thread Manolis Tsamis

On Wed, Nov 30, 2022 at 9:44 AM Richard Biener
 wrote:
>
> On Tue, Nov 29, 2022 at 11:05 AM Manolis Tsamis  
> wrote:
> >
> > When using SWAR (SIMD in a register) techniques a comparison operation
> > within such a register can be made by using a combination of shifts,
> > bitwise and and multiplication. If code using this scheme is vectorized
> > then there is potential to replace all these operations with a single
> > vector comparison, by reinterpreting the vector types to match the width
> > of the SWAR register.
> >
> > For example, for the test function packed_cmp_16_32, the original
> > generated code is:
> > ldr q0, [x0]
> > add w1, w1, 1
> > ushr v0.4s, v0.4s, 15
> > and v0.16b, v0.16b, v2.16b
> > shl v1.4s, v0.4s, 16
> > sub v0.4s, v1.4s, v0.4s
> > str q0, [x0], 16
> > cmp w2, w1
> > bhi .L20
> >
> > with this pattern the above can be optimized to:
> >
> > ldr q0, [x0]
> > add w1, w1, 1
> > cmlt v0.8h, v0.8h, #0
> > str q0, [x0], 16
> > cmp w2, w1
> > bhi .L20
> >
> > The effect is similar for x86-64.
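
For reference, the scalar SWAR idiom being matched can be sketched as
follows.  This is an illustration only: the function and struct names are
invented, the 16-bit-in-32-bit layout is assumed from the test function
name, and the constants are derived the way the quoted match.pd hunk
derives its target_mult_i and target_bit_and_i.

```cpp
#include <cstdint>

// Constants for a packed "element < 0" test, derived from the shift amount.
struct SwarConstants {
  std::uint32_t mult;  // (1 << cmp_bits) - 1
  std::uint32_t mask;  // lowest bit of every cmp_bits-wide element
};

constexpr SwarConstants swar_constants(unsigned rshift, unsigned word_bits) {
  const unsigned cmp_bits = rshift + 1;  // must be smaller than 32 here
  std::uint32_t mask = 0;
  for (unsigned i = 0; i < word_bits / cmp_bits; ++i)
    mask = (mask << cmp_bits) | 1u;
  return SwarConstants{(std::uint32_t{1} << cmp_bits) - 1u, mask};
}

// Two 16-bit lanes packed in 32 bits: every lane whose sign bit is set
// becomes 0xFFFF, every other lane becomes 0.
constexpr std::uint32_t packed_lt_zero_16x2(std::uint32_t x) {
  const SwarConstants c = swar_constants(15, 32);
  return ((x >> 15) & c.mask) * c.mult;
}

// Lane-by-lane reference for the same result.
constexpr std::uint32_t lanewise_lt_zero_16x2(std::uint32_t x) {
  std::uint32_t out = 0;
  for (unsigned lane = 0; lane < 2; ++lane)
    if ((x >> (16 * lane)) & 0x8000u)
      out |= std::uint32_t{0xFFFFu} << (16 * lane);
  return out;
}
```

The shift moves each lane's sign bit to the lane's lowest bit, the mask
isolates it, and the multiplication by 0xFFFF smears it across the lane,
which is exactly what a vector "compare less than zero" on 16-bit
elements produces.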
> >
> > Signed-off-by: Manolis Tsamis 
> >
> > gcc/ChangeLog:
> >
> > * match.pd: Simplify vector shift + bit_and + multiply in some cases.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/aarch64/swar_to_vec_cmp.c: New test.
> >
> > ---
> >
> > Changes in v2:
> > - Changed pattern to use vec_cond_expr.
> > - Changed pattern to work with VLA vector.
> > - Added more checks and comments.
> >
> >  gcc/match.pd  | 60 
> >  .../gcc.target/aarch64/swar_to_vec_cmp.c  | 72 +++
> >  2 files changed, 132 insertions(+)
> >  create mode 100644 gcc/testsuite/gcc.target/aarch64/swar_to_vec_cmp.c
> >
> > diff --git a/gcc/match.pd b/gcc/match.pd
> > index 67a0a682f31..05e7fc79ba8 100644
> > --- a/gcc/match.pd
> > +++ b/gcc/match.pd
> > @@ -301,6 +301,66 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> >  (view_convert (bit_and:itype (view_convert @0)
> >  (ne @1 { build_zero_cst (type); })))
> >
> > +/* In SWAR (SIMD in a register) code a signed comparison of packed data can
> > +   be constructed with a particular combination of shift, bitwise and,
> > +   and multiplication by constants.  If that code is vectorized we can
> > +   convert this pattern into a more efficient vector comparison.  */
> > +(simplify
> > + (mult (bit_and (rshift @0 uniform_integer_cst_p@1)
> > +   uniform_integer_cst_p@2)
> > +uniform_integer_cst_p@3)
>
> Please use VECTOR_CST in the match instead of uniform_integer_cst_p
> and instead ...
>

Will do.

> > + (with {
> > +   tree rshift_cst = uniform_integer_cst_p (@1);
> > +   tree bit_and_cst = uniform_integer_cst_p (@2);
> > +   tree mult_cst = uniform_integer_cst_p (@3);
> > +  }
> > +  /* Make sure we're working with vectors and uniform vector constants.  */
> > +  (if (VECTOR_TYPE_P (type)
>
> ... test for non-NULL *_cst here where you can use uniform_vector_p instead
> of uniform_integer_cst_p.  You can elide the VECTOR_TYPE_P check then
> and instead do INTEGRAL_TYPE_P (TREE_TYPE (type)).
>

Will do.

> > +   && tree_fits_uhwi_p (rshift_cst)
> > +   && tree_fits_uhwi_p (mult_cst)
> > +   && tree_fits_uhwi_p (bit_and_cst))
> > +   /* Compute what constants would be needed for this to represent a packed
> > +  comparison based on the shift amount denoted by RSHIFT_CST.  */
> > +   (with {
> > + HOST_WIDE_INT vec_elem_bits = vector_element_bits (type);
> > + poly_int64 vec_nelts = TYPE_VECTOR_SUBPARTS (type);
> > + poly_int64 vec_bits = vec_elem_bits * vec_nelts;
> > +
> > + unsigned HOST_WIDE_INT cmp_bits_i, bit_and_i, mult_i;
> > + unsigned HOST_WIDE_INT target_mult_i, target_bit_and_i;
> > + cmp_bits_i = tree_to_uhwi (rshift_cst) + 1;
> > + target_mult_i = (HOST_WIDE_INT_1U << cmp_bits_i) - 1;
> > +
> > + mult_i = tree_to_uhwi (mult_cst);
> > + bit_and_i = tree_to_uhwi (bit_and_cst);
> > + target_bit_and_i = 0;
> > +
> > + /* The bit pattern in BIT_AND_I should be a mask for the least
> > +significant bit of each packed element that is CMP_BITS wide.  */
> > + for (unsigned i = 0; i < vec_elem_bits / cmp_bits_i; i++)
> > +   target_bit_and_i = (target_bit_and_i << cmp_bits_i) | 1U;
> > +}
> > +(if ((exact_log2 (cmp_bits_i)) >= 0
> > +&& cmp_bits_i < HOST_BITS_PER_WIDE_INT
> > +&& multiple_p (vec_bits, cmp_bits_i)
> > +&& vec_elem_bits <= HOST_BITS_PER_WIDE_INT
> > +&& target_mult_i == mult_i
> > +&& target_bit_and_i == bit_and_i)
> > + /* Compute the vector shape for the comparison and check if the target is
> > +   able to expand the comparison with that type.  */
> > + (with {
> > +

Re: [PATCH] coroutines: Fix promotion of class members in co_await statements [PR99576]

2022-11-30 Thread Iain Sandoe

Hi Adrian,

> On 28 Nov 2022, at 20:44, Iain Sandoe  wrote:

>> Bootstrapping and running the testsuite on x86_64 was successful. No
>> regressions occurred.
>
> This looks reasonable to me, as said in the PR.  I’d like to test a little
> more widely with some larger codebases, if you could bear with me for a
> few days.

So wider testing (in this case folly) reveals that, although the analysis seems
reasonable, this is not quite the right patch to fix the issue.  CONSTRUCTORs
can contain nested await expressions, so we cannot simply punt on seeing one.

My hunch is that the real solution lies in (correctly) deciding whether to
promote the temporary or not.  Jason recently made a change that identifies
whether a target expression is expected to be elided (i.e. it is a direct
initializer for another object) - I think this might help in this case.  My
concern is whether I should read “expected to be elided” as a guarantee
(“expected” to me could also be read as “but it might not be”).

As is, the patch is not OK since it regresses cases with nested await
expressions in CONSTRUCTORs.
Sorry for not spotting that sooner,
Iain

[PATCH] RISC-V: optimize stack manipulation in save-restore

2022-11-30 Thread Fei Gao

The stack space that save-restore reserves is not well accounted for in
stack allocation and deallocation.
This patch allows fewer instructions to be used for stack allocation and
deallocation when save-restore is enabled, and also gives much clearer
logic for save-restore stack manipulation.

before patch:
bar:
call t0,__riscv_save_4
addi sp,sp,-64
...
li t0,-12288
addi t0,t0,-1968 # optimized out after patch
add sp,sp,t0 # prologue
...
li t0,12288 # epilogue
addi t0,t0,2000 # optimized out after patch
add sp,sp,t0
...
addi sp,sp,32
tail __riscv_restore_4

after patch:
bar:
call t0,__riscv_save_4
addi sp,sp,-2032
...
li t0,-12288
add sp,sp,t0 # prologue
...
li t0,12288 # epilogue
add sp,sp,t0
...
addi sp,sp,2032
tail __riscv_restore_4
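
The numbers above can be reproduced with a simplified model of the first
step selection.  This is a sketch, not riscv.cc: it only covers constant
sizes, min_first_step is passed in instead of being computed from the
frame layout, a 16-byte stack alignment is assumed, and the final
fallback to max_first_step is an assumption about the part of the
function not shown in this mail.

```cpp
#include <cstdint>

constexpr std::int64_t kImmReach = 4096;       // span of a 12-bit immediate
constexpr std::int64_t kStackAlignBytes = 16;  // assumed stack alignment

// True if v fits the signed 12-bit immediate of an ADDI.
constexpr bool small_operand(std::int64_t v) {
  return v >= -kImmReach / 2 && v < kImmReach / 2;
}

// Pick the first stack adjustment for a frame of `remaining` bytes.
constexpr std::int64_t first_stack_step(std::int64_t remaining,
                                        std::int64_t min_first_step) {
  if (small_operand(remaining))
    return remaining;

  const std::int64_t max_first_step = kImmReach / 2 - kStackAlignBytes;
  const std::int64_t min_second_step = remaining - max_first_step;

  // Use the low bits of the size as the first step, so that the second
  // step is a multiple of 4096 and needs only LUI + ADD.
  if (!small_operand(min_second_step)
      && remaining % kImmReach < kImmReach / 2
      && remaining % kImmReach >= min_first_step)
    return remaining % kImmReach;

  return max_first_step;  // assumed fallback, see the note above
}
```

In the example, 14320 bytes remain to be allocated after the
__riscv_save_4 call (64 + 12288 + 1968 before the patch).  The model
picks 2032 as the first step, which leaves exactly 12288 for the second
step, so a single li followed by add is enough.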

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_first_stack_step): Add a new function
parameter remaining_size.
(riscv_compute_frame_info): Adapt to new riscv_first_stack_step interface.
(riscv_expand_prologue): Consider save-restore in stack allocation.
(riscv_expand_epilogue): Consider save-restore in stack deallocation.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/stack_save_restore.c: New test.
---
 gcc/config/riscv/riscv.cc | 58 ++-
 .../gcc.target/riscv/stack_save_restore.c | 40 +
 2 files changed, 70 insertions(+), 28 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/stack_save_restore.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 05bdba5ab4d..9e92e729a5f 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -4634,7 +4634,7 @@ riscv_save_libcall_count (unsigned mask)
They decrease stack_pointer_rtx but leave frame_pointer_rtx and
hard_frame_pointer_rtx unchanged.  */
 
-static HOST_WIDE_INT riscv_first_stack_step (struct riscv_frame_info *frame);
+static HOST_WIDE_INT riscv_first_stack_step (struct riscv_frame_info *frame, 
poly_int64 remaining_size);
 
 /* Handle stack align for poly_int.  */
 static poly_int64
@@ -4663,7 +4663,7 @@ riscv_compute_frame_info (void)
  save/restore t0.  We check for this before clearing the frame struct.  */
   if (cfun->machine->interrupt_handler_p)
 {
-  HOST_WIDE_INT step1 = riscv_first_stack_step (frame);
+  HOST_WIDE_INT step1 = riscv_first_stack_step (frame, frame->total_size);
   if (! POLY_SMALL_OPERAND_P ((frame->total_size - step1)))
interrupt_save_prologue_temp = true;
 }
@@ -4913,31 +4913,31 @@ riscv_restore_reg (rtx reg, rtx mem)
without adding extra instructions.  */
 
 static HOST_WIDE_INT
-riscv_first_stack_step (struct riscv_frame_info *frame)
+riscv_first_stack_step (struct riscv_frame_info *frame, poly_int64 
remaining_size)
 {
-  HOST_WIDE_INT frame_total_constant_size;
-  if (!frame->total_size.is_constant ())
-frame_total_constant_size
-  = riscv_stack_align (frame->total_size.coeffs[0])
-   - riscv_stack_align (frame->total_size.coeffs[1]);
+  HOST_WIDE_INT remaining_const_size;
+  if (!remaining_size.is_constant ())
+remaining_const_size
+  = riscv_stack_align (remaining_size.coeffs[0])
+   - riscv_stack_align (remaining_size.coeffs[1]);
   else
-frame_total_constant_size = frame->total_size.to_constant ();
+remaining_const_size = remaining_size.to_constant ();
 
-  if (SMALL_OPERAND (frame_total_constant_size))
-return frame_total_constant_size;
+  if (SMALL_OPERAND (remaining_const_size))
+return remaining_const_size;
 
   HOST_WIDE_INT min_first_step =
-RISCV_STACK_ALIGN ((frame->total_size - 
frame->frame_pointer_offset).to_constant());
+RISCV_STACK_ALIGN ((remaining_size - 
frame->frame_pointer_offset).to_constant());
   HOST_WIDE_INT max_first_step = IMM_REACH / 2 - PREFERRED_STACK_BOUNDARY / 8;
-  HOST_WIDE_INT min_second_step = frame_total_constant_size - max_first_step;
+  HOST_WIDE_INT min_second_step = remaining_const_size - max_first_step;
   gcc_assert (min_first_step <= max_first_step);
 
   /* As an optimization, use the least-significant bits of the total frame
  size, so that the second adjustment step is just LUI + ADD.  */
   if (!SMALL_OPERAND (min_second_step)
-  && frame_total_constant_size % IMM_REACH < IMM_REACH / 2
-  && frame_total_constant_size % IMM_REACH >= min_first_step)
-return frame_total_constant_size % IMM_REACH;
+  && remaining_const_size % IMM_REACH < IMM_REACH / 2
+  && remaining_const_size % IMM_REACH >= min_first_step)
+return remaining_const_size % IMM_REACH;

PING^1 [PATCH v2] rs6000: Rework option -mpowerpc64 handling [PR106680]

2022-11-30 Thread Kewen.Lin via Gcc-patches

Hi,

Gentle ping this:

https://gcc.gnu.org/pipermail/gcc-patches/2022-October/603350.html

BR,
Kewen

on 2022/10/12 16:12, Kewen.Lin via Gcc-patches wrote:
> Hi,
> 
> PR106680 shows that -m32 -mpowerpc64 is different from
> -mpowerpc64 -m32; this is determined by the way we
> handle option powerpc64 in rs6000_handle_option.
>
> Segher pointed out that this difference should be treated as
> a bug and that we should ensure that option powerpc64 is
> independent of -m32/-m64.  So this patch removes the
> handling in rs6000_handle_option and adds the necessary
> support in rs6000_option_override_internal instead.
> 
> With this patch, if users specify -m{no-,}powerpc64, the
> specified value is honoured, otherwise, for 64bit it
> always enables OPTION_MASK_POWERPC64; while for 32bit
> and TARGET_POWERPC64 and OS_MISSING_POWERPC64, it disables
> OPTION_MASK_POWERPC64.
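
The rule described in the paragraph above can be written down as a small
decision function.  This is only a sketch of the described behaviour:
resolve_powerpc64 and its parameters are hypothetical names, not the
rs6000 option-handling code, and implied_powerpc64 stands for whatever
TARGET_DEFAULT or the cpu mask would otherwise set.

```cpp
#include <optional>

// explicit_value:       -mpowerpc64 / -mno-powerpc64 if given, else empty
// is_64bit:             -m64 is in effect
// implied_powerpc64:    value implied by the defaults or the cpu mask
// os_missing_powerpc64: the OS lacks 64-bit register support in 32-bit mode
constexpr bool resolve_powerpc64(std::optional<bool> explicit_value,
                                 bool is_64bit, bool implied_powerpc64,
                                 bool os_missing_powerpc64) {
  if (explicit_value)
    return *explicit_value;  // an explicit option is always honoured
  if (is_64bit)
    return true;             // 64-bit always enables it
  if (implied_powerpc64 && os_missing_powerpc64)
    return false;            // 32-bit on an OS without support disables it
  return implied_powerpc64;
}
```

Because the function only looks at which options are present and not at
the order in which they were given, -m32 -mpowerpc64 and
-mpowerpc64 -m32 necessarily resolve to the same value.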
> 
> btw, following Segher's suggestion, I made some attempts to warn
> when OPTION_MASK_POWERPC64 is set for OS_MISSING_POWERPC64.
> If we warn when powerpc64 is specified explicitly,
> some test cases using -m32 -mpowerpc64 on ppc64-linux
> need updates, and an artificial run
> with "--target_board=unix'{-m32/-mpowerpc64}'" will produce
> noisy warnings on ppc64-linux.  If we warn when
> it's specified implicitly, it can be initialized just by
> TARGET_DEFAULT (like -m32 on ppc64-linux) or set from the
> given cpu mask, so we would have to special-case those and not warn.
> Following Segher's latest comment, I decided not to warn and
> to keep the behaviour consistent with before.
> 
> Bootstrapped and regress-tested on:
>   - powerpc64-linux-gnu P7 and P8 {-m64,-m32}
>   - powerpc64le-linux-gnu P9 and P10
>   - powerpc-ibm-aix7.2.0.0 {-maix64,-maix32}
> 
> Hi Iain, could you help to test this new patch on darwin
> again?  Thanks in advance!
> 
> Is it ok for trunk if darwin testing goes well?
>

[PATCH v2] predict: Adjust optimize_function_for_size_p [PR105818]

2022-11-30 Thread Kewen.Lin via Gcc-patches

Hi,

Function optimize_function_for_size_p returns OPTIMIZE_SIZE_NO
if fun->decl is not null but no cgraph node is available for it.
As PR105818 shows, this can have unexpected consequences.  For
the case in PR105818, when parsing the bar decl in function foo,
cfun is the function structure for foo, for which there is no
cgraph node, so it returns OPTIMIZE_SIZE_NO.  But that is incorrect,
since the context is to optimize for size: the flag optimize_size
is true.

The patch is to make optimize_function_for_size_p to check
opt_for_fn (fun->decl, optimize_size) further when fun->decl
is available but no cgraph node, it's just like what function
cgraph_node::optimize_for_size_p does at its first step.

One regression failure got exposed on aarch64-linux-gnu:

PASS->FAIL: gcc.dg/guality/pr54693-2.c   -Os \
-DPREVENT_OPTIMIZATION  line 21 x == 10 - i

The difference comes from the macro LOGICAL_OP_NON_SHORT_CIRCUIT
used in function fold_range_test during C parsing; it uses
optimize_function_for_speed_p, which is the inversion
of optimize_function_for_size_p.  At that time cfun->decl is valid
but there is no cgraph node for it.  Without this patch
optimize_function_for_speed_p eventually returns true, while it
returns false with this patch.  Since the command line option -Os
is specified, there is no reason to interpret it as "for speed".
I think this failure is expected, so I adjusted the test case
accordingly.
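
To make the before/after behaviour concrete, here is a small
stand-alone C model of the fallback path.  It is a sketch, not GCC
code: the model_* types and functions are hypothetical stand-ins for
struct function, cgraph_node and opt_for_fn, it assumes fun->decl is
non-null (the only case the patch touches), and the "patched" flag
exists only so both behaviours can be compared side by side.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the two size levels that appear in the patch.  */
enum model_size_level { MODEL_SIZE_NO, MODEL_SIZE_MAX };

/* Per-function setting; models opt_for_fn (decl, optimize_size).  */
struct model_decl { bool optimize_size; };

/* Stand-in for a cgraph node that already knows its size level.  */
struct model_node { enum model_size_level level; };

struct model_function
{
  const struct model_decl *decl; /* assumed non-null in this sketch */
  const struct model_node *node; /* NULL models "no cgraph node" */
};

enum model_size_level
model_size_level_for (const struct model_function *fun, bool patched)
{
  /* With a cgraph node, defer to it (unchanged by the patch).  */
  if (fun->node != NULL)
    return fun->node->level;

  /* No node: the patched version consults the decl's own setting.  */
  if (patched && fun->decl->optimize_size)
    return MODEL_SIZE_MAX;

  /* Pre-patch behaviour: always "not for size".  */
  return MODEL_SIZE_NO;
}

/* "For speed" is modeled as the inversion of "for size".  */
bool
model_for_speed (const struct model_function *fun, bool patched)
{
  return model_size_level_for (fun, patched) == MODEL_SIZE_NO;
}
```

For a function compiled with -Os that has no node yet, the model
answers "for speed" before the patch and "for size" after it, which
matches the change seen in fold_range_test.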

v1: https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596628.html

Compared with v1, v2 adopts opt_for_fn (fun->decl, optimize_size)
instead of optimize_size, following Honza's previous comments.

Besides, the reply to Honza's question "Why exactly PR105818 hits
the flag change issue?" is at the link:
https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596667.html

Bootstrapped and regtested on x86_64-redhat-linux,
aarch64-linux-gnu and powerpc64{,le}-linux-gnu.

Is it ok for trunk?

BR,
Kewen
-
PR middle-end/105818

gcc/ChangeLog:

* predict.cc (optimize_function_for_size_p): Further check
optimize_size of fun->decl when it is valid but no cgraph node.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr105818.c: New test.
* gcc.dg/guality/pr54693-2.c: Adjust for aarch64.
---
 gcc/predict.cc  |  3 ++-
 gcc/testsuite/gcc.dg/guality/pr54693-2.c|  2 +-
 gcc/testsuite/gcc.target/powerpc/pr105818.c | 11 +++
 3 files changed, 14 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr105818.c

diff --git a/gcc/predict.cc b/gcc/predict.cc
index 1bc7ab94454..ecb4aabc9df 100644
--- a/gcc/predict.cc
+++ b/gcc/predict.cc
@@ -268,7 +268,8 @@ optimize_function_for_size_p (struct function *fun)
   cgraph_node *n = cgraph_node::get (fun->decl);
   if (n)
 return n->optimize_for_size_p ();
-  return OPTIMIZE_SIZE_NO;
+  return opt_for_fn (fun->decl, optimize_size) ? OPTIMIZE_SIZE_MAX
+  : OPTIMIZE_SIZE_NO;
 }

 /* Return true if function FUN should always be optimized for speed.  */
diff --git a/gcc/testsuite/gcc.dg/guality/pr54693-2.c b/gcc/testsuite/gcc.dg/guality/pr54693-2.c
index 68aa6c63d71..14ca94ad37d 100644
--- a/gcc/testsuite/gcc.dg/guality/pr54693-2.c
+++ b/gcc/testsuite/gcc.dg/guality/pr54693-2.c
@@ -17,7 +17,7 @@ foo (int x, int y, int z)
   int i = 0;
   while (x > 3 && y > 3 && z > 3)
 {  /* { dg-final { gdb-test .+2 "i" "v + 1" } } */
-   /* { dg-final { gdb-test .+1 "x" "10 - i" } } */
+   /* { dg-final { gdb-test .+1 "x" "10 - i" { xfail { aarch64*-*-* && { any-opts "-Os" } } } } } */
   bar (i); /* { dg-final { gdb-test . "y" "20 - 2 * i" } } */
/* { dg-final { gdb-test .-1 "z" "30 - 3 * i" { xfail { aarch64*-*-* && { any-opts "-fno-fat-lto-objects" "-Os" } } } } } */
   i++, x--, y -= 2, z -= 3;
diff --git a/gcc/testsuite/gcc.target/powerpc/pr105818.c b/gcc/testsuite/gcc.target/powerpc/pr105818.c
new file mode 100644
index 000..679647e189d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr105818.c
@@ -0,0 +1,11 @@
+/* { dg-options "-Os -fno-tree-vectorize" } */
+
+/* Verify there is no ICE.  */
+
+#pragma GCC optimize "-fno-tree-vectorize"
+
+void
+foo (void)
+{
+  void bar (void);
+}
--
2.27.0

[PATCH] rs6000: Fix some issues related to Power10 fusion [PR104024]

2022-11-30 Thread Kewen.Lin via Gcc-patches

Hi,

As PR104024 shows, the option -mpower10-fusion isn't guarded by
-mcpu=power10, so the compiler fuses some patterns even without
power10 support and then ICEs unexpectedly.  This patch simply
unmasks the option when power10 support is absent, without
emitting any warning, as this option is undocumented.

Besides, some define_insns in fusion.md that use constraint
v require the condition VECTOR_UNIT_ALTIVEC_OR_VSX_P
(<MODE>mode); otherwise they can cause an ICE in reload, see test
case pr104024-2.c.
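
The two guards can be pictured with a short stand-alone C model.  It is
only a sketch: the MODEL_MASK_* bits and both functions are
hypothetical and merely mirror the idea of clearing the fusion flag
when power10 is absent and of requiring a vector unit for the
vector-register patterns; the real checks live in
rs6000_option_override_internal and in the insn conditions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits; not the real rs6000 option masks.  */
#define MODEL_MASK_POWER10    (UINT64_C (1) << 0)
#define MODEL_MASK_P10_FUSION (UINT64_C (1) << 1)

/* Silently drop the fusion flag when power10 is not enabled.  */
uint64_t
model_override_fusion (uint64_t flags)
{
  if ((flags & MODEL_MASK_POWER10) == 0)
    flags &= ~MODEL_MASK_P10_FUSION;
  return flags;
}

/* Condition for a fused pattern that uses vector registers: the vector
   unit must be usable for the mode AND fusion must be enabled.  */
bool
model_vector_fusion_ok (bool vector_unit_ok, uint64_t flags)
{
  return vector_unit_ok && (flags & MODEL_MASK_P10_FUSION) != 0;
}
```

In this model a command line that asks for fusion without power10 ends
up with fusion off, and a vector pattern is rejected whenever the
vector unit is unavailable, even if fusion is on.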

Bootstrapped and regtested on powerpc64-linux-gnu P8,
powerpc64le-linux-gnu P9 and P10.

Is it ok for trunk?

BR,
Kewen
-
PR target/104024

gcc/ChangeLog:

* config/rs6000/rs6000.cc (rs6000_option_override_internal): Disable
TARGET_P10_FUSION if !TARGET_POWER10.
* config/rs6000/fusion.md: Regenerate.
* config/rs6000/genfusion.pl: Add the check for define_insns
with constraint v.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr104024-1.c: New test.
* gcc.target/powerpc/pr104024-2.c: New test.
---
 gcc/config/rs6000/fusion.md   | 130 +-
 gcc/config/rs6000/genfusion.pl|  12 +-
 gcc/config/rs6000/rs6000.cc   |  11 +-
 gcc/testsuite/gcc.target/powerpc/pr104024-1.c |  16 +++
 gcc/testsuite/gcc.target/powerpc/pr104024-2.c |  18 +++
 5 files changed, 113 insertions(+), 74 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr104024-1.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr104024-2.c

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index 15f0c16f705..c504f65a045 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -1875,7 +1875,7 @@ (define_insn "*fuse_vand_vand"
   (match_operand:VM 1 "altivec_register_operand" "%v,v,v,v"))
  (match_operand:VM 2 "altivec_register_operand" "v,v,v,v")))
(clobber (match_scratch:VM 4 "=X,X,X,&v"))]
-  "(TARGET_P10_FUSION)"
+  "(VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode) && TARGET_P10_FUSION)"
   "@
vand %3,%1,%0\;vand %3,%3,%2
vand %3,%1,%0\;vand %3,%3,%2
@@ -1893,7 +1893,7 @@ (define_insn "*fuse_vandc_vand"
   (match_operand:VM 1 "altivec_register_operand" "v,v,v,v"))
  (match_operand:VM 2 "altivec_register_operand" "v,v,v,v")))
(clobber (match_scratch:VM 4 "=X,X,X,&v"))]
-  "(TARGET_P10_FUSION)"
+  "(VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode) && TARGET_P10_FUSION)"
   "@
vandc %3,%1,%0\;vand %3,%3,%2
vandc %3,%1,%0\;vand %3,%3,%2
@@ -1911,7 +1911,7 @@ (define_insn "*fuse_veqv_vand"
   (match_operand:VM 1 "altivec_register_operand" "v,v,v,v")))
  (match_operand:VM 2 "altivec_register_operand" "v,v,v,v")))
(clobber (match_scratch:VM 4 "=X,X,X,&v"))]
-  "(TARGET_P10_FUSION)"
+  "(VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode) && TARGET_P10_FUSION)"
   "@
veqv %3,%1,%0\;vand %3,%3,%2
veqv %3,%1,%0\;vand %3,%3,%2
@@ -1929,7 +1929,7 @@ (define_insn "*fuse_vnand_vand"
   (not:VM (match_operand:VM 1 "altivec_register_operand" "v,v,v,v")))
  (match_operand:VM 2 "altivec_register_operand" "v,v,v,v")))
(clobber (match_scratch:VM 4 "=X,X,X,&v"))]
-  "(TARGET_P10_FUSION)"
+  "(VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode) && TARGET_P10_FUSION)"
   "@
vnand %3,%1,%0\;vand %3,%3,%2
vnand %3,%1,%0\;vand %3,%3,%2
@@ -1947,7 +1947,7 @@ (define_insn "*fuse_vnor_vand"
   (not:VM (match_operand:VM 1 "altivec_register_operand" "v,v,v,v")))
  (match_operand:VM 2 "altivec_register_operand" "v,v,v,v")))
(clobber (match_scratch:VM 4 "=X,X,X,&v"))]
-  "(TARGET_P10_FUSION)"
+  "(VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode) && TARGET_P10_FUSION)"
   "@
vnor %3,%1,%0\;vand %3,%3,%2
vnor %3,%1,%0\;vand %3,%3,%2
@@ -1965,7 +1965,7 @@ (define_insn "*fuse_vor_vand"
   (match_operand:VM 1 "altivec_register_operand" "v,v,v,v"))
  (match_operand:VM 2 "altivec_register_operand" "v,v,v,v")))
(clobber (match_scratch:VM 4 "=X,X,X,&v"))]
-  "(TARGET_P10_FUSION)"
+  "(VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode) && TARGET_P10_FUSION)"
   "@
vor %3,%1,%0\;vand %3,%3,%2
vor %3,%1,%0\;vand %3,%3,%2
@@ -1983,7 +1983,7 @@ (define_insn "*fuse_vorc_vand"
   (match_operand:VM 1 "altivec_register_operand" "v,v,v,v"))
  (match_operand:VM 2 "altivec_register_operand" "v,v,v,v")))
(clobber (match_scratch:VM 4 "=X,X,X,&v"))]
-  "(TARGET_P10_FUSION)"
+  "(VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode) && TARGET_P10_FUSION)"
   "@
vorc %3,%1,%0\;vand %3,%3,%2
vorc %3,%1,%0\;vand %3,%3,%2
@@ -2001,7 +2001,7 @@ (define_insn "*fuse_vxor_vand"
   (match_operand:VM 1 "altivec_register_operand" "v,v,v,v"))
  (match_operand:VM 2 "altivec_register_operand" "v,v,v,v")))
(clobber (match_scratch:VM 4

[PR107304] note test's ifunc requirement

2022-11-30 Thread Alexandre Oliva via Gcc-patches



The test uses target_clones, which requires ifunc support.

Tested on GNU/Linux/x86_64, Solaris/x86 (thanks Rainer), and with a
cross to x86_64-elf.  Approved by Rainer in the PR.  I'm checking it in,
trunk only for now.


for  gcc/testsuite/ChangeLog

PR target/107304
* gcc.target/i386/pr107304.c: dg-require ifunc support.
---
 gcc/testsuite/gcc.target/i386/pr107304.c |1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/testsuite/gcc.target/i386/pr107304.c b/gcc/testsuite/gcc.target/i386/pr107304.c
index 24d68795e7f1c..0043b7b21a32f 100644
--- a/gcc/testsuite/gcc.target/i386/pr107304.c
+++ b/gcc/testsuite/gcc.target/i386/pr107304.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-O0 -march=tigerlake" } */
+/* { dg-require-ifunc "" } */
 
 #include 
 

-- 
Alexandre Oliva, happy hacker  https://FSFLA.org/blogs/lxo/
   Free Software Activist   GNU Toolchain Engineer
Disinformation flourishes because many people care deeply about injustice
but very few check the facts.  Ask me about
