[Bug modula2/112946] Assignment of string to enumeration or set crashes

2023-12-15 Thread gaius at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112946

Gaius Mulley changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|FIXED                       |---

--- Comment #6 from Gaius Mulley ---
MODULE badexpression3 ;

TYPE
   enums = (red, blue, green) ;
   set = SET OF enums ;
VAR
   setvar : set;
   enumvar: enums;
BEGIN
   setvar := set {red, blue} ;
   setvar := setvar + green ;  (* Should detect an error here.  *)
   IF NOT (green IN setvar)
   THEN
      HALT
   END
END badexpression3.

This compiles successfully, but the compiler should report an error at
"setvar + green".  M2GenGCC.mod:FoldBinary (and FoldUnary) need the same
treatment that FoldBecomes received in the original patch.

Re: Re: [PATCH] RISC-V: Add Zvfbfmin extension to the -march= option

2023-12-15 Thread Xiao Zeng
2023-12-16 03:27  Jeff Law wrote:
>
>On 12/12/23 20:24, Xiao Zeng wrote:
>> This patch would like to add new sub extension (aka Zvfbfmin) to the
>> -march= option. It introduces a new data type BF16.
>>
>> Depending on different usage scenarios, the Zvfbfmin extension may
>> depend on 'V' or 'Zve32f'. This patch only implements dependencies
>> in scenario of Embedded Processor. In scenario of Application
>> Processor, it is necessary to explicitly indicate the dependent
>> 'V' extension.
>>
>> You can locate more information about Zvfbfmin from below spec doc.
>>
>> https://github.com/riscv/riscv-bfloat16/releases/download/20231027/riscv-bfloat16.pdf
>>
>> gcc/ChangeLog:
>>
>> * common/config/riscv/riscv-common.cc:
>> (riscv_implied_info): Add zvfbfmin item.
>>  (riscv_ext_version_table): Ditto.
>>  (riscv_ext_flag_table): Ditto.
>> * config/riscv/riscv.opt:
>> (MASK_ZVFBFMIN): New macro.
>> (MASK_VECTOR_ELEN_BF_16): Ditto.
>> (TARGET_ZVFBFMIN): Ditto.
>>
>> gcc/testsuite/ChangeLog:
>>
>> * gcc.target/riscv/arch-31.c: New test.
>> * gcc.target/riscv/arch-32.c: New test.
>> * gcc.target/riscv/predef-32.c: New test.
>> * gcc.target/riscv/predef-33.c: New test.
>I fixed the trivial whitespace issue with the ChangeLog and pushed this
>to the trunk. 
Thank you, Jeff. I will pay attention to these issues in future patches.

>However, I do want to stress that all future
>contributions need to indicate that the patch was successfully
>regression tested. 
Understood. I will indicate this in future patches as well.

>
>jeff
 
Thanks
Xiao Zeng



[Bug target/113042] popcount of 8bits and 128bits can be improved for !TARGET_CSSC

2023-12-15 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113042

--- Comment #2 from Andrew Pinski ---
Blah, adding popcountti2 does not work, as the middle-end (in
fold_builtin_bit_query) already splits it into two __builtin_popcountll
calls ...

[Bug target/103781] generic/cortex-a53 cost model for SLP for aarch64 is good

2023-12-15 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103781

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |pinskia at gcc dot gnu.org

--- Comment #5 from Andrew Pinski ---
I know that the generic cost model has changed on the trunk but I am not sure
this one is fixed ...

[Bug target/113043] New: ICE: in emit_move_insn, at expr.cc:4246

2023-12-15 Thread iamanonymous.cs at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113043

            Bug ID: 113043
           Summary: ICE: in emit_move_insn, at expr.cc:4246
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: iamanonymous.cs at gmail dot com
  Target Milestone: ---

Compiler Explorer: https://godbolt.org/z/erY5Mrrso

***
OS and Platform:
$ uname -a:
Linux ubuntu 4.15.0-213-generic #224-Ubuntu SMP Mon Jun 19 13:30:12 UTC 2023
x86_64 x86_64 x86_64 GNU/Linux
***
gcc version:
$ gcc -v
Using built-in specs.
COLLECT_GCC=/root/gcc_set/202311291030/bin/gcc
COLLECT_LTO_WRAPPER=/root/gcc_set/202311291030/libexec/gcc/x86_64-pc-linux-gnu/14.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc/configure --prefix=/root/gcc_set/202311291030
--with-gmp=/root/build_essential --with-mpfr=/root/build_essential
--with-mpc=/root/build_essential --enable-languages=c,c++ --disable-multilib
--with-sanitizer=address,undefined,thread,leak
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 14.0.0 20231129 (experimental) (GCC) 

git version: 99fa0bfd63d97825c4221dcd3123940f1d0e6291
***
Program:
$ cat mutant.c
struct a {
  long b
};
__attribute__((interrupt)) void c(struct a *d) { 1 != d->b; }

***
Command Lines:
$ gcc -mx32 -mgeneral-regs-only -maddress-mode=long -fsanitize=undefined
mutant.c
mutant.c:3:1: warning: no semicolon at end of struct or union
    3 | };
      | ^
during RTL pass: expand
mutant.c: In function ‘c’:
mutant.c:4:56: internal compiler error: in emit_move_insn, at expr.cc:4246
    4 | __attribute__((interrupt)) void c(struct a *d) { 1 != d->b; }
      |                                                       ~^~~
0x774d50 emit_move_insn(rtx_def*, rtx_def*)
        ../../gcc/gcc/expr.cc:4246
0xaddfec expand_gimple_stmt_1
        ../../gcc/gcc/cfgexpand.cc:4011
0xaddfec expand_gimple_stmt
        ../../gcc/gcc/cfgexpand.cc:4045
0xadec97 expand_gimple_basic_block
        ../../gcc/gcc/cfgexpand.cc:6101
0xae08f6 execute
        ../../gcc/gcc/cfgexpand.cc:6836
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

[Bug tree-optimization/96461] [SVE] Use the HISTCNT instruction for simple histogram loops

2023-12-15 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96461

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |pinskia at gcc dot gnu.org
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2023-12-16
     Ever confirmed|0                           |1

--- Comment #1 from Andrew Pinski ---
Confirmed.

[Bug c++/113031] [14 Regression] ICE in cxx_fold_indirect_ref_1 starting with r14-6508

2023-12-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113031

--- Comment #4 from GCC Commits ---
The master branch has been updated by Nathaniel Shead:

https://gcc.gnu.org/g:39f9c426f58448d6df340cdccd84e05721a20921

commit r14-6619-g39f9c426f58448d6df340cdccd84e05721a20921
Author: Nathaniel Shead 
Date:   Sat Dec 16 10:59:03 2023 +1100

c++: Fix unchecked use of CLASSTYPE_AS_BASE [PR113031]

My previous commit (naively) assumed that a TREE_CODE of RECORD_TYPE or
UNION_TYPE was sufficient for optype to be considered a "class type".
However, this does not account for e.g. template type parameters of
record or union type. This patch corrects to check for CLASS_TYPE_P
before checking for as-base conversion.

PR c++/113031

gcc/cp/ChangeLog:

* constexpr.cc (cxx_fold_indirect_ref_1): Check for CLASS_TYPE
before using CLASSTYPE_AS_BASE.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/pr113031.C: New test.

Signed-off-by: Nathaniel Shead 
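
The shape of this fix, testing a predicate before using an accessor that is
only meaningful for some nodes, can be illustrated with a toy model. This is a
minimal sketch and not GCC code: TypeNode, LangSpecific, class_type_p and
as_base_matches are hypothetical names, and the assumption that non-class
nodes carry no class-specific data is made purely for illustration.

```cpp
#include <cassert>

// Hypothetical toy model of a type node; none of these names are GCC's.
enum class Code { Record, Union, Other };

struct TypeNode;

// Extra data that, in this model, exists only for genuine class types.
struct LangSpecific {
  const TypeNode *as_base;
};

struct TypeNode {
  Code code;
  const LangSpecific *lang;  // null for anything that is not a class type
};

// Stand-in for a CLASS_TYPE_P-style predicate: the node's code alone is not
// sufficient, the node must also carry class-specific data.
inline bool class_type_p(const TypeNode &t) {
  return (t.code == Code::Record || t.code == Code::Union) && t.lang != nullptr;
}

// Guarded comparison: the accessor is reached only after the predicate holds,
// mirroring the "CLASS_TYPE_P (optype) && CLASSTYPE_AS_BASE (optype) == type"
// shape of the patch.
inline bool as_base_matches(const TypeNode &optype, const TypeNode &type) {
  assert(&optype != nullptr);
  return class_type_p(optype) && optype.lang->as_base == &type;
}
```

In this model a node with a record code but no class data (the analogue of the
template type parameter case from the commit message) is rejected by the
predicate instead of dereferencing a null pointer.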

[Bug target/113034] Miscompilation of __m128 ne comparison on LoongArch

2023-12-15 Thread c at jia dot je via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113034

--- Comment #1 from Jiajie Chen ---
As pointed out by @lrzlin, the expected output should be:

```
1 -1 0
```

[Bug target/113042] popcount of 8bits and 128bits can be improved for !TARGET_CSSC

2023-12-15 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113042

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2023-12-16
             Status|UNCONFIRMED                 |ASSIGNED

--- Comment #1 from Andrew Pinski ---
Will submit the patch early next week.

[Bug target/113042] New: popcount of 8bits and 128bits can be improved for !TARGET_CSSC

2023-12-15 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113042

            Bug ID: 113042
           Summary: popcount of 8bits and 128bits can be improved for
                    !TARGET_CSSC
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: enhancement
          Priority: P3
         Component: target
          Assignee: pinskia at gcc dot gnu.org
          Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---
            Target: aarch64

Take:
```
unsigned h8 (const unsigned char *restrict a) {
  return __builtin_popcountg (a[0]);
}


unsigned __int128 h128 (const unsigned __int128 *restrict a) {
  return __builtin_popcountg (a[0]);
}

```

Currently h8 produces:
```
        ldr     b31, [x0]
        cnt     v31.8b, v31.8b
        addv    b31, v31.8b
        fmov    w0, s31
        ret
```
But the addv is not needed here and we could instead just get:
```
        ldr     b31, [x0]
        cnt     v31.8b, v31.8b
        smov    w0, v31.b[0]
        ret
```

For h128, there are two cnt:
```
        ldp     d30, d31, [x0]
        mov     x1, 0
        cnt     v30.8b, v30.8b
        cnt     v31.8b, v31.8b
        addv    b30, v30.8b
        addv    b31, v31.8b
        fmov    x2, d30
        fmov    x0, d31
        add     x0, x2, x0
        ret
```

But we could do instead:
```
        ldr     q30, [x0]
        mov     x1, 0
        cnt     v30.16b, v30.16b
        addv    b30, v30.16b
        fmov    x0, d30
        ret
ret
```

Basically we need to implement popcountqi2 and popcountti2 patterns.

Note that for TARGET_CSSC, I suspect using the scalar cnt will still be
better, so I won't enable these patterns for that.
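
For reference, the scalar semantics of the two cases can be sketched as
follows. This is only an illustration of what h8 and h128 compute, assuming a
GCC-compatible compiler and a target that provides unsigned __int128; the
helper names popcount8 and popcount128 are mine, and the code says nothing
about how the proposed popcountqi2/popcountti2 patterns would be written. The
128-bit helper uses the form built from two __builtin_popcountll calls.

```cpp
#include <cassert>
#include <cstdint>

// Population count of an 8-bit value. The argument is promoted to a wider
// integer type before the builtin counts its set bits.
inline unsigned popcount8(std::uint8_t x) {
  return static_cast<unsigned>(__builtin_popcount(x));
}

// Population count of a 128-bit value, computed as the sum of the counts of
// its low and high 64-bit halves.
inline unsigned popcount128(unsigned __int128 x) {
  const auto lo = static_cast<std::uint64_t>(x);
  const auto hi = static_cast<std::uint64_t>(x >> 64);
  const int total = __builtin_popcountll(lo) + __builtin_popcountll(hi);
  assert(total >= 0 && total <= 128);
  return static_cast<unsigned>(total);
}
```

Since the result is at most 128, the upper half of a 128-bit return value is
always zero, which matches the "mov x1, 0" in the h128 listings.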

Re: [committed v4 5/5] aarch64: Add function multiversioning support

2023-12-15 Thread Ramana Radhakrishnan
On Sat, Dec 16, 2023 at 6:18 AM Andrew Carlotti  wrote:
>
> This adds initial support for function multiversioning on aarch64 using
> the target_version and target_clones attributes.  This loosely follows
> the Beta specification in the ACLE [1], although with some differences
> that still need to be resolved (possibly as follow-up patches).
>
> Existing function multiversioning implementations are broken in various
> ways when used across translation units.  This includes placing
> resolvers in the wrong translation units, and using symbol mangling that
> causes callers to unintentionally bypass the resolver in some circumstances.
> Fixing these issues for aarch64 will require modifications to our ACLE
> specification.  It will also require further adjustments to existing
> middle end code, to facilitate different mangling and resolver
> placement while preserving existing target behaviours.
>
> The list of function multiversioning features specified in the ACLE is
> also inconsistent with the list of features supported in target option
> extensions.  I intend to resolve some or all of these inconsistencies at
> a later stage.
>
> The target_version attribute is currently only supported in C++, since
> this is the only frontend with existing support for multiversioning
> using the target attribute.  On the other hand, this patch happens to
> enable multiversioning with the target_clones attribute in Ada and D, as
> well as the entire C family, using their existing frontend support.
>
> This patch also does not support the following aspects of the Beta
> specification:
>
> - The target_clones attribute should allow an implicit unlisted
>   "default" version.
> - There should be an option to disable function multiversioning at
>   compile time.
> - Unrecognised target names in a target_clones attribute should be
>   ignored (with an optional warning).  This current patch raises an
>   error instead.
>
> [1] 
> https://github.com/ARM-software/acle/blob/main/main/acle.md#function-multi-versioning
>
> Committed as approved with the coding convention fix, plus some adjustments to
> aarch64-option-extensions.def to accommodate recent changes on master. The
> series passed regression testing as a whole post-rebase on aarch64.

Pretty neat, very nice to see this work land - I would consider this
for the NEWS page for GCC-14.

Ramana

>
> gcc/ChangeLog:
>
> * config/aarch64/aarch64-feature-deps.h (fmv_deps_):
> Define aarch64_feature_flags mask for each FMV feature.
> * config/aarch64/aarch64-option-extensions.def: Use new macros
> to define FMV feature extensions.
> * config/aarch64/aarch64.cc (aarch64_option_valid_attribute_p):
> Check for target_version attribute after processing target
> attribute.
> (aarch64_fmv_feature_data): New.
> (aarch64_parse_fmv_features): New.
> (aarch64_process_target_version_attr): New.
> (aarch64_option_valid_version_attribute_p): New.
> (get_feature_mask_for_version): New.
> (compare_feature_masks): New.
> (aarch64_compare_version_priority): New.
> (build_ifunc_arg_type): New.
> (make_resolver_func): New.
> (add_condition_to_bb): New.
> (dispatch_function_versions): New.
> (aarch64_generate_version_dispatcher_body): New.
> (aarch64_get_function_versions_dispatcher): New.
> (aarch64_common_function_versions): New.
> (aarch64_mangle_decl_assembler_name): New.
> (TARGET_OPTION_VALID_VERSION_ATTRIBUTE_P): New implementation.
> (TARGET_OPTION_EXPANDED_CLONES_ATTRIBUTE): New implementation.
> (TARGET_OPTION_FUNCTION_VERSIONS): New implementation.
> (TARGET_COMPARE_VERSION_PRIORITY): New implementation.
> (TARGET_GENERATE_VERSION_DISPATCHER_BODY): New implementation.
> (TARGET_GET_FUNCTION_VERSIONS_DISPATCHER): New implementation.
> (TARGET_MANGLE_DECL_ASSEMBLER_NAME): New implementation.
> * config/aarch64/aarch64.h (TARGET_HAS_FMV_TARGET_ATTRIBUTE):
> Set target macro.
> * config/arm/aarch-common.h (enum aarch_parse_opt_result): Add
> new value to report duplicate FMV feature.
> * common/config/aarch64/cpuinfo.h: New file.
>
> libgcc/ChangeLog:
>
> * config/aarch64/cpuinfo.c (enum CPUFeatures): Move to shared
> copy in gcc/common
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/aarch64/options_set_17.c: Reorder expected flags.
> * gcc.target/aarch64/cpunative/native_cpu_0.c: Ditto.
> * gcc.target/aarch64/cpunative/native_cpu_13.c: Ditto.
> * gcc.target/aarch64/cpunative/native_cpu_16.c: Ditto.
> * gcc.target/aarch64/cpunative/native_cpu_17.c: Ditto.
> * gcc.target/aarch64/cpunative/native_cpu_18.c: Ditto.
> * gcc.target/aarch64/cpunative/native_cpu_19.c: Ditto.
> * gcc.target/aarch64/cpunative/native_cpu_20.c: Ditto.
> * 

Re: [PATCH] c++: Fix unchecked use of CLASSTYPE_AS_BASE [PR113031]

2023-12-15 Thread Jason Merrill

On 12/15/23 19:20, Nathaniel Shead wrote:

Bootstrapped and regtested on x86_64-pc-linux-gnu with GLIBCXX_TESTSUITE_STDS=20
and RUNTESTFLAGS="--target_board=unix/-D_GLIBCXX_USE_CXX11_ABI=0".


OK, thanks.


-- >8 --

My previous patch (naively) assumed that a TREE_CODE of RECORD_TYPE or
UNION_TYPE was sufficient for optype to be considered a "class type".
However, this does not account for e.g. template type parameters of
record or union type. This patch corrects to check for CLASS_TYPE_P
before checking for as-base conversion.

PR c++/113031

gcc/cp/ChangeLog:

* constexpr.cc (cxx_fold_indirect_ref_1): Check for CLASS_TYPE
before using CLASSTYPE_AS_BASE.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/pr113031.C: New test.

Signed-off-by: Nathaniel Shead 
---
  gcc/cp/constexpr.cc   |  3 ++-
  gcc/testsuite/g++.dg/cpp0x/pr113031.C | 34 +++
  2 files changed, 36 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/pr113031.C

diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
index e1b2d27fc36..051f73fb73f 100644
--- a/gcc/cp/constexpr.cc
+++ b/gcc/cp/constexpr.cc
@@ -5709,7 +5709,8 @@ cxx_fold_indirect_ref_1 (const constexpr_ctx *ctx, location_t loc, tree type,
         }
 
       /* Handle conversion to "as base" type.  */
-      if (CLASSTYPE_AS_BASE (optype) == type)
+      if (CLASS_TYPE_P (optype)
+          && CLASSTYPE_AS_BASE (optype) == type)
         return op;
 
       /* Handle conversion to an empty base class, which is represented with a

diff --git a/gcc/testsuite/g++.dg/cpp0x/pr113031.C b/gcc/testsuite/g++.dg/cpp0x/pr113031.C
new file mode 100644
index 00000000000..aecdc3fc4b2
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/pr113031.C
@@ -0,0 +1,34 @@
+// PR c++/113031
+// { dg-do compile }
+
+template <typename> struct variant;
+
+template <typename _Types, typename _Tp>
+variant<_Types> __variant_cast(_Tp __rhs) { return static_cast<variant<_Types>&>(__rhs); }
+
+template <typename _Types>
+struct _Move_assign_base : _Types {
+  void operator=(_Move_assign_base __rhs) { __variant_cast<_Types>(__rhs); }
+};
+
+template <typename _Types>
+struct variant : _Move_assign_base<_Types> {
+  void emplace() {
+    variant __tmp;
+    *this = __tmp;
+  }
+};
+
+struct _Undefined_class {
+  struct _Nocopy_types {
+    void (_Undefined_class::*_M_member_pointer)();
+  };
+  struct function : _Nocopy_types {
+    struct optional {
+      void test03() {
+        variant<function> v;
+        v.emplace();
+      }
+    };
+  };
+};




[Bug c++/113041] New: misleading diagnostic for variable of non-literal type in constexpr function in C++20 mode

2023-12-15 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113041

            Bug ID: 113041
           Summary: misleading diagnostic for variable of non-literal type
                    in constexpr function in C++20 mode
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ppalka at gcc dot gnu.org
  Target Milestone: ---

Here f() is not constexpr in C++20 (but is in C++23 after P2242R3) due to
the local variable a of non-literal type A, so the testcase is invalid:

struct A { ~A(); };

void g();

template<typename T>
constexpr int f() {
  if (__builtin_is_constant_evaluated())
    return 42;
  g();
  A a;
}

constexpr int n = f<int>();

But our diagnostic incorrectly blames the call to the non-constexpr g() for
f() being non-constexpr, rather than the variable a:

constexpr-nonliteral.C:13:25: error: ‘constexpr int f() [with T = int]’ called
in a constant expression
   13 | constexpr int n = f<int>();
      |                       ~~^~
constexpr-nonliteral.C:6:15: note: ‘constexpr int f() [with T = int]’ is not
usable as a ‘constexpr’ function because:
    6 | constexpr int f() {
      |               ^
constexpr-nonliteral.C:9:4: error: call to non-‘constexpr’ function ‘void g()’
    9 |   g();
      |   ~^~
constexpr-nonliteral.C:3:6: note: ‘void g()’ declared here
    3 | void g();
      |      ^

[Bug tree-optimization/53947] [meta-bug] vectorizer missed-optimizations

2023-12-15 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
Bug 53947 depends on bug 98792, which changed state.

Bug 98792 Summary: Fail to use SHRN instructions for narrowing shift on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98792

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

[Bug target/98792] Fail to use SHRN instructions for narrowing shift on aarch64

2023-12-15 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98792

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #4 from Andrew Pinski ---
It was fixed in GCC 12 by one of the following commits:
r12-7142-g83d7e720cd1d07
r12-7141-gbce43c0493f65d
r12-7140-g4057266ce5afc1
r12-7138-gaeef5c57f161ad

[Bug target/113039] [14 Regression] -fcf-protection -fcf-protection=branch doesn't work

2023-12-15 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113039

--- Comment #2 from H.J. Lu ---
(In reply to H.J. Lu from comment #1)
> This may be caused by r14-2692-g1c6231c05bdcca

This commit changes the behavior of -fcf-protection -fcf-protection=branch.
The workaround is to use -fcf-protection -fcf-protection=none
-fcf-protection=branch.

[Bug target/113039] [14 Regression] -fcf-protection -fcf-protection=branch doesn't work

2023-12-15 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113039

H.J. Lu changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2023-12-16
     Ever confirmed|0                           |1
   Target Milestone|---                         |14.0
             Status|UNCONFIRMED                 |NEW

--- Comment #1 from H.J. Lu ---
This may be caused by r14-2692-g1c6231c05bdcca

[Bug target/113040] New: [14 Regression] libmvec test failures

2023-12-15 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113040

            Bug ID: 113040
           Summary: [14 Regression] libmvec test failures
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: hjl.tools at gmail dot com
                CC: crazylht at gmail dot com
  Target Milestone: ---
            Target: x86-64

When using GCC 14 (r14-6607-g082835836cf763) to build and test glibc on
x86-64, I got

FAIL: math/test-float-libmvec-sincosf-avx2
FAIL: math/test-float-libmvec-sincosf-avx512f

Program received signal SIGSEGV, Segmentation fault.
_ZGVdN8vvv_sincosf_avx2 ()
at ../sysdeps/x86_64/fpu/multiarch/svml_s_sincosf8_core_avx2.S:305
305 movl  %r9d, (%r10)
(gdb)

and

Program received signal SIGSEGV, Segmentation fault.
0x5547e976 in _ZGVeN16vvv_sincosf_skx ()
at ../sysdeps/x86_64/fpu/multiarch/svml_s_sincosf16_core_avx512.S:741
741 WRAPPER_AVX512_vvv_vl4l4 _ZGVeN16vl4l4_sincosf_skx
(gdb)

[Bug target/113039] New: [14 Regression] -fcf-protection -fcf-protection=branch doesn't work

2023-12-15 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113039

            Bug ID: 113039
           Summary: [14 Regression] -fcf-protection -fcf-protection=branch
                    doesn't work
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: hjl.tools at gmail dot com
                CC: crazylht at gmail dot com
  Target Milestone: ---
            Target: x86-64

[hjl@gnu-cfl-3 pr85334]$ touch x.c
[hjl@gnu-cfl-3 pr85334]$
/export/build/gnu/tools-build/gcc-gitlab-debug/release/usr/gcc-14.0.0-x86-64/bin/gcc
-c -fcf-protection -fcf-protection=branch x.c
[hjl@gnu-cfl-3 pr85334]$ readelf -n x.o

Displaying notes found in: .note.gnu.property
  Owner                Data size        Description
  GNU                  0x00000010       NT_GNU_PROPERTY_TYPE_0
      Properties: x86 feature: IBT, SHSTK
  GNU                  0x00000020       NT_GNU_PROPERTY_TYPE_0
      Properties: x86 ISA used:
        x86 feature used: x86
[hjl@gnu-cfl-3 pr85334]$ 

-fcf-protection=branch should override -fcf-protection.
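
The difference between the expected and the observed behavior can be modelled
with two small resolvers. This is a sketch under stated assumptions, not a
description of GCC's option handling: the bit values and function names are
hypothetical, and the "accumulate" rule is merely one model that is consistent
with the readelf output above (IBT and SHSTK both set) and with the reported
workaround of inserting -fcf-protection=none. Plain -fcf-protection is treated
as "full", i.e. branch plus return protection.

```cpp
#include <cassert>
#include <vector>

// Hypothetical encoding of the protection kinds; not GCC's internal values.
constexpr unsigned kNone = 0;
constexpr unsigned kBranch = 1u << 0;          // IBT
constexpr unsigned kReturn = 1u << 1;          // SHSTK
constexpr unsigned kFull = kBranch | kReturn;  // plain -fcf-protection

// "Last option wins": every occurrence replaces the previous setting.
// This is the behavior the report expects.
inline unsigned resolve_override(const std::vector<unsigned> &opts) {
  unsigned state = kNone;
  for (unsigned o : opts) state = o;
  return state;
}

// "Accumulate": every occurrence is OR-ed into the setting and only `none`
// resets it. This model reproduces the observed result and the workaround.
inline unsigned resolve_accumulate(const std::vector<unsigned> &opts) {
  unsigned state = kNone;
  for (unsigned o : opts) state = (o == kNone) ? kNone : (state | o);
  assert((state & ~kFull) == 0);
  return state;
}
```

With the option sequence {full, branch}, the override model yields branch
only, while the accumulate model yields full; with {full, none, branch} both
models yield branch only.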

[committed v4 5/5] aarch64: Add function multiversioning support

2023-12-15 Thread Andrew Carlotti
This adds initial support for function multiversioning on aarch64 using
the target_version and target_clones attributes.  This loosely follows
the Beta specification in the ACLE [1], although with some differences
that still need to be resolved (possibly as follow-up patches).

Existing function multiversioning implementations are broken in various
ways when used across translation units.  This includes placing
resolvers in the wrong translation units, and using symbol mangling that
causes callers to unintentionally bypass the resolver in some circumstances.
Fixing these issues for aarch64 will require modifications to our ACLE
specification.  It will also require further adjustments to existing
middle end code, to facilitate different mangling and resolver
placement while preserving existing target behaviours.

The list of function multiversioning features specified in the ACLE is
also inconsistent with the list of features supported in target option
extensions.  I intend to resolve some or all of these inconsistencies at
a later stage.

The target_version attribute is currently only supported in C++, since
this is the only frontend with existing support for multiversioning
using the target attribute.  On the other hand, this patch happens to
enable multiversioning with the target_clones attribute in Ada and D, as
well as the entire C family, using their existing frontend support.

This patch also does not support the following aspects of the Beta
specification:

- The target_clones attribute should allow an implicit unlisted
  "default" version.
- There should be an option to disable function multiversioning at
  compile time.
- Unrecognised target names in a target_clones attribute should be
  ignored (with an optional warning).  This current patch raises an
  error instead.

[1] 
https://github.com/ARM-software/acle/blob/main/main/acle.md#function-multi-versioning

Committed as approved with the coding convention fix, plus some adjustments to
aarch64-option-extensions.def to accommodate recent changes on master. The
series passed regression testing as a whole post-rebase on aarch64.
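
As a rough illustration of what a multiversioning resolver does conceptually,
the following hand-written sketch selects among several implementations based
on a mask of available features. It is not the code this patch generates: the
feature bits, the Version table, the priority-by-table-order rule, and all
function names are hypothetical and chosen only to show the idea.

```cpp
#include <cassert>
#include <cstdint>

using feature_mask = std::uint64_t;

// Hypothetical feature bits; a real target defines its own encoding.
constexpr feature_mask kFeatA = 1u << 0;
constexpr feature_mask kFeatB = 1u << 1;

using impl_fn = int (*)(int);

// Three versions of the same function, distinguishable by their results.
inline int impl_default(int x) { return x + 1; }
inline int impl_a(int x) { return x + 10; }
inline int impl_ab(int x) { return x + 100; }

struct Version {
  feature_mask required;  // every bit must be available to pick this version
  impl_fn fn;
};

// Versions listed from highest to lowest priority. The default version
// requires nothing, so it always matches and must come last.
constexpr Version kVersions[] = {
    {kFeatA | kFeatB, impl_ab},
    {kFeatA, impl_a},
    {0, impl_default},
};

// Resolver: return the first version whose requirements are a subset of the
// features reported as available.
inline impl_fn resolve(feature_mask available) {
  for (const Version &v : kVersions)
    if ((v.required & ~available) == 0) return v.fn;
  assert(false && "the default version should always match");
  return impl_default;
}
```

Callers would go through the pointer returned by resolve() rather than naming
a particular version, which is the role the resolver plays for a versioned
function.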

gcc/ChangeLog:

* config/aarch64/aarch64-feature-deps.h (fmv_deps_):
Define aarch64_feature_flags mask for each FMV feature.
* config/aarch64/aarch64-option-extensions.def: Use new macros
to define FMV feature extensions.
* config/aarch64/aarch64.cc (aarch64_option_valid_attribute_p):
Check for target_version attribute after processing target
attribute.
(aarch64_fmv_feature_data): New.
(aarch64_parse_fmv_features): New.
(aarch64_process_target_version_attr): New.
(aarch64_option_valid_version_attribute_p): New.
(get_feature_mask_for_version): New.
(compare_feature_masks): New.
(aarch64_compare_version_priority): New.
(build_ifunc_arg_type): New.
(make_resolver_func): New.
(add_condition_to_bb): New.
(dispatch_function_versions): New.
(aarch64_generate_version_dispatcher_body): New.
(aarch64_get_function_versions_dispatcher): New.
(aarch64_common_function_versions): New.
(aarch64_mangle_decl_assembler_name): New.
(TARGET_OPTION_VALID_VERSION_ATTRIBUTE_P): New implementation.
(TARGET_OPTION_EXPANDED_CLONES_ATTRIBUTE): New implementation.
(TARGET_OPTION_FUNCTION_VERSIONS): New implementation.
(TARGET_COMPARE_VERSION_PRIORITY): New implementation.
(TARGET_GENERATE_VERSION_DISPATCHER_BODY): New implementation.
(TARGET_GET_FUNCTION_VERSIONS_DISPATCHER): New implementation.
(TARGET_MANGLE_DECL_ASSEMBLER_NAME): New implementation.
* config/aarch64/aarch64.h (TARGET_HAS_FMV_TARGET_ATTRIBUTE):
Set target macro.
* config/arm/aarch-common.h (enum aarch_parse_opt_result): Add
new value to report duplicate FMV feature.
* common/config/aarch64/cpuinfo.h: New file.

libgcc/ChangeLog:

* config/aarch64/cpuinfo.c (enum CPUFeatures): Move to shared
copy in gcc/common

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/options_set_17.c: Reorder expected flags.
* gcc.target/aarch64/cpunative/native_cpu_0.c: Ditto.
* gcc.target/aarch64/cpunative/native_cpu_13.c: Ditto.
* gcc.target/aarch64/cpunative/native_cpu_16.c: Ditto.
* gcc.target/aarch64/cpunative/native_cpu_17.c: Ditto.
* gcc.target/aarch64/cpunative/native_cpu_18.c: Ditto.
* gcc.target/aarch64/cpunative/native_cpu_19.c: Ditto.
* gcc.target/aarch64/cpunative/native_cpu_20.c: Ditto.
* gcc.target/aarch64/cpunative/native_cpu_21.c: Ditto.
* gcc.target/aarch64/cpunative/native_cpu_22.c: Ditto.
* gcc.target/aarch64/cpunative/native_cpu_6.c: Ditto.
* gcc.target/aarch64/cpunative/native_cpu_7.c: Ditto.


diff --git a/gcc/common/config/aarch64/cpuinfo.h 
b/gcc/common/config/aarch64/cpuinfo.h
new file mode 100644
index 

[committed v4 4/5] Add support for target_version attribute

2023-12-15 Thread Andrew Carlotti
This patch adds support for the "target_version" attribute to the middle
end and the C++ frontend, which will be used to implement function
multiversioning in the aarch64 backend.

On targets that don't use the "target" attribute for multiversioning,
there is no conflict between the "target" and "target_clones"
attributes.  This patch therefore makes the mutual exclusion in
C-family, D and Ada conditional upon the value of the
expanded_clones_attribute target hook.

The "target_version" attribute is only added to C++ in this patch,
because this is currently the only frontend which supports
multiversioning using the "target" attribute.  Support for the
"target_version" attribute will be extended to C at a later date.

Targets that currently use the "target" attribute for function
multiversioning (i.e. i386 and rs6000) are not affected by this patch.

Committed as approved with adjustments to comments in c-attribs.

gcc/ChangeLog:

* attribs.cc (decl_attributes): Pass attribute name to target.
(is_function_default_version): Update comment to specify
incompatibility with target_version attributes.
* cgraphclones.cc (cgraph_node::create_version_clone_with_body):
Call valid_version_attribute_p for target_version attributes.
* defaults.h (TARGET_HAS_FMV_TARGET_ATTRIBUTE): New macro.
* target.def (valid_version_attribute_p): New hook.
* doc/tm.texi.in: Add new hook.
* doc/tm.texi: Regenerate.
* multiple_target.cc (create_dispatcher_calls): Remove redundant
is_function_default_version check.
(expand_target_clones): Use target macro to pick attribute name.
* targhooks.cc (default_target_option_valid_version_attribute_p):
New.
* targhooks.h (default_target_option_valid_version_attribute_p):
New.
* tree.h (DECL_FUNCTION_VERSIONED): Update comment to include
target_version attributes.

gcc/c-family/ChangeLog:

* c-attribs.cc (attr_target_exclusions): Make
target/target_clones exclusion target-dependent.
(attr_target_clones_exclusions): Ditto, and add target_version.
(attr_target_version_exclusions): New.
(c_common_attribute_table): Add target_version.
(handle_target_version_attribute): New.
(handle_target_attribute): Amend comment.
(handle_target_clones_attribute): Ditto.

gcc/ada/ChangeLog:

* gcc-interface/utils.cc (attr_target_exclusions): Make
target/target_clones exclusion target-dependent.
(attr_target_clones_exclusions): Ditto.

gcc/d/ChangeLog:

* d-attribs.cc (attr_target_exclusions): Make
target/target_clones exclusion target-dependent.
(attr_target_clones_exclusions): Ditto.

gcc/cp/ChangeLog:

* decl2.cc (check_classfn): Update comment to include
target_version attributes.


diff --git a/gcc/ada/gcc-interface/utils.cc b/gcc/ada/gcc-interface/utils.cc
index 3eabbec6bd34116910a0589b4ebf269b916cc607..17f6afd687d1dbd7648d52d86417414b04c0d896 100644
--- a/gcc/ada/gcc-interface/utils.cc
+++ b/gcc/ada/gcc-interface/utils.cc
@@ -146,14 +146,16 @@ static const struct attribute_spec::exclusions attr_noinline_exclusions[] =
 
 static const struct attribute_spec::exclusions attr_target_exclusions[] =
 {
-  { "target_clones", true, true, true },
+  { "target_clones", TARGET_HAS_FMV_TARGET_ATTRIBUTE,
+TARGET_HAS_FMV_TARGET_ATTRIBUTE, TARGET_HAS_FMV_TARGET_ATTRIBUTE },
   { NULL, false, false, false },
 };
 
 static const struct attribute_spec::exclusions attr_target_clones_exclusions[] =
 {
   { "always_inline", true, true, true },
-  { "target", true, true, true },
+  { "target", TARGET_HAS_FMV_TARGET_ATTRIBUTE, TARGET_HAS_FMV_TARGET_ATTRIBUTE,
+TARGET_HAS_FMV_TARGET_ATTRIBUTE },
   { NULL, false, false, false },
 };
 
diff --git a/gcc/attribs.cc b/gcc/attribs.cc
index 4e313d38f0f0608991c3267f55f43e3f0dd9d74a..0ca2779788569b7a02a79eab4db558df112aff87 100644
--- a/gcc/attribs.cc
+++ b/gcc/attribs.cc
@@ -675,7 +675,8 @@ decl_attributes (tree *node, tree attributes, int flags,
  options to the attribute((target(...))) list.  */
   if (TREE_CODE (*node) == FUNCTION_DECL
   && current_target_pragma
-  && targetm.target_option.valid_attribute_p (*node, NULL_TREE,
+  && targetm.target_option.valid_attribute_p (*node,
+ get_identifier ("target"),
  current_target_pragma, 0))
 {
   tree cur_attr = lookup_attribute ("target", attributes);
@@ -1276,8 +1277,9 @@ make_dispatcher_decl (const tree decl)
   return func_decl;  
 }
 
-/* Returns true if decl is multi-versioned and DECL is the default function,
-   that is it is not tagged with target specific optimization.  */
+/* Returns true if DECL is multi-versioned using the target attribute, and this
+   is the default version.  This function can only be used for targets that do
+   

[Bug fortran/112873] F2023 degree trig functions

2023-12-15 Thread jvdelisle at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112873

Jerry DeLisle changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #34 from Jerry DeLisle ---
If not resolved, feel free to reopen.

[Bug sanitizer/112727] [11/12/13 Regression] UBSAN creates GIMPLE path with uninitialized variable

2023-12-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112727

--- Comment #11 from GCC Commits ---
The releases/gcc-12 branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:2b8f9636c1b19fff7723995f2d58d41f3d30c46d

commit r12-10056-g2b8f9636c1b19fff7723995f2d58d41f3d30c46d
Author: Jakub Jelinek 
Date:   Fri Dec 8 20:56:48 2023 +0100

c++: Unshare folded SAVE_EXPR arguments during cp_fold [PR112727]

The following testcase is miscompiled because two ubsan instrumentations
run into each other.
The first one is the shift instrumentation.  Before the C++ FE calls
it, it wraps the 2 shift arguments with cp_save_expr, so that side-effects
in them aren't evaluated multiple times.  And, ubsan_instrument_shift
itself uses unshare_expr on any uses of the operands to make sure further
modifications in them don't affect other copies of them (the only not
unshared ones are the one the caller then uses for the actual operation
after the instrumentation, which means there is no tree sharing).

Now, if there are side-effects in the first operand like say function
call, cp_save_expr wraps it into a SAVE_EXPR, and ubsan_instrument_shift
in this mode emits something like
if (..., SAVE_EXPR , SAVE_EXPR  > const)
 __ubsan_handle_shift_out_of_bounds (..., SAVE_EXPR , ...);
and caller adds
SAVE_EXPR  << SAVE_EXPR 
after it in a COMPOUND_EXPR.  So far so good.

If there are no side-effects and cp_save_expr doesn't create SAVE_EXPR,
everything is ok as well because of the unshare_expr.
We have
if (..., SAVE_EXPR  > const)
 __ubsan_handle_shift_out_of_bounds (..., ptr->something[i], ...);
and
ptr->something[i] << SAVE_EXPR 
where ptr->something[i] is unshared.

In the testcase below, the !x->s[j] ? 1 : 0 expression is wrapped initially
into a SAVE_EXPR though, and unshare_expr doesn't unshare SAVE_EXPRs nor
anything used in them for obvious reasons, so we end up with:
if (..., SAVE_EXPR (x)->s[j] ?
1 : 0>, SAVE_EXPR  > const)
 __ubsan_handle_shift_out_of_bounds (..., SAVE_EXPR (x)->s[j] ? 1 : 0>, ...);
and
SAVE_EXPR (x)->s[j] ? 1 : 0> <<
SAVE_EXPR 
So far good as well.  But later during cp_fold of the SAVE_EXPR we find
out that VIEW_CONVERT_EXPR(x)->s[j] ? 0 : 1 is actually
invariant (has TREE_READONLY set) and so cp_fold simplifies the above to
if (..., SAVE_EXPR  > const)
 __ubsan_handle_shift_out_of_bounds (..., (bool) VIEW_CONVERT_EXPR(x)->s[j] ? 0 : 1, ...);
and
((bool) VIEW_CONVERT_EXPR(x)->s[j] ? 0 : 1) << SAVE_EXPR

with the s[j] ARRAY_REFs and other expressions shared in between the two
uses (and obviously the expression optimized away from the COMPOUND_EXPR in
the if condition).

Then comes another ubsan instrumentation at genericization time,
this time to instrument the ARRAY_REFs with strict bounds checking,
and replaces the s[j] in there with
s[.UBSAN_BOUNDS (0B, SAVE_EXPR, 8), SAVE_EXPR]
As the trees are shared, it does that just once though.
And as the if body is gimplified first, the SAVE_EXPR is evaluated
inside of the if body and when it is used again after the if, it uses a
potentially uninitialized value of j.1 (always uninitialized if the shift
count isn't out of bounds).

The following patch fixes that by unshare_expr unsharing the folded
argument of a SAVE_EXPR if we've folded the SAVE_EXPR into an invariant and
it is used more than once.
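
To illustrate the sharing hazard described above, here is a minimal sketch
using a toy expression node. The Node type and the unshare and instrument
helpers are hypothetical stand-ins, not GCC's tree, unshare_expr or the ubsan
instrumentation. It only shows that rewriting a shared subtree in place
changes every use, while a deep copy keeps the uses independent.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Toy expression node; a hypothetical stand-in for GCC's tree, not the real type.
struct Node {
  std::string text;
  std::shared_ptr<Node> child;  // at most one operand, enough for this sketch
};

// Deep-copy a node so that later in-place edits do not leak into other uses.
// This plays the role that unshare_expr plays in the commit message.
std::shared_ptr<Node> unshare(const std::shared_ptr<Node>& n) {
  if (!n) return nullptr;
  auto copy = std::make_shared<Node>();
  copy->text = n->text;
  copy->child = unshare(n->child);
  return copy;
}

// Simulate a later instrumentation pass that rewrites a subexpression in place.
void instrument(const std::shared_ptr<Node>& n) {
  assert(n && "instrument needs a node");
  n->text = "checked(" + n->text + ")";
}
```

If two uses hold the same Node, a single instrument call is visible through
both of them; if one use holds the result of unshare, it is unaffected.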

2023-12-08  Jakub Jelinek  

PR sanitizer/112727
* cp-gimplify.cc (cp_fold): If SAVE_EXPR has been previously
folded, unshare_expr what is returned.

* c-c++-common/ubsan/pr112727.c: New test.

(cherry picked from commit 6ddaf06e375e1c15dcda338697ab6ea457e6f497)

[Bug middle-end/112733] [14 Regression] ICE: Segmentation fault in wide-int.cc during GIMPLE pass: sccp

2023-12-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112733

--- Comment #17 from GCC Commits  ---
The releases/gcc-12 branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:4bf040fcd23aaa7d8c4873d1170776ab117bc213

commit r12-10055-g4bf040fcd23aaa7d8c4873d1170776ab117bc213
Author: Jakub Jelinek 
Date:   Wed Nov 29 12:26:50 2023 +0100

fold-const: Fix up multiple_of_p [PR112733]

We ICE on the following testcase when wi::multiple_of_p is called on
widest_int 1 and -128 with UNSIGNED.  I still need to work on the
actual wide-int.cc issue, the latest patch attached to the PR regressed
bitint-{38,39}.c, so will need to debug that, but there is a clear bug
on the fold-const.cc side as well - widest_int is a signed representation
by definition, using UNSIGNED with it certainly doesn't match what was
intended, because -128 as the second operand effectively means unsigned
131072 bit 0xf80 integer, not the signed char -128
that appeared in the source.

In the INTEGER_CST case a few lines above this we already use
case INTEGER_CST:
  if (TREE_CODE (bottom) != INTEGER_CST || integer_zerop (bottom))
    return false;
  return wi::multiple_of_p (wi::to_widest (top), wi::to_widest (bottom),
                            SIGNED);
so I think using SIGNED with widest_int is best there (compared to the
other choices in the PR).
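
As a rough 64-bit analogy of why the signedness argument matters (this is not
the wide-int API; multiple_of_signed and multiple_of_unsigned are hypothetical
helper names), the same bit pattern for -128 is a small negative divisor when
read as signed but an enormous divisor when read as unsigned:

```cpp
#include <cassert>
#include <cstdint>

// Is top a multiple of bottom when both are read as signed values?
bool multiple_of_signed(int64_t top, int64_t bottom) {
  assert(bottom != 0);
  return top % bottom == 0;
}

// Same question, but the identical bit patterns are read as unsigned.
// A negative bottom then becomes a value close to 2^64.
bool multiple_of_unsigned(int64_t top, int64_t bottom) {
  assert(bottom != 0);
  return static_cast<uint64_t>(top) % static_cast<uint64_t>(bottom) == 0;
}
```

With these helpers, 256 is a multiple of -128 under the signed reading but not
under the unsigned one, which mirrors the mismatch the commit describes for a
signed representation combined with UNSIGNED.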

2023-11-29  Jakub Jelinek  

PR middle-end/112733
* fold-const.cc (multiple_of_p): Pass SIGNED rather than
UNSIGNED for wi::multiple_of_p on widest_int arguments.

* gcc.dg/pr112733.c: New test.

(cherry picked from commit 5c95bf945c632925efba86dd5dceccdb9da8884c)

[Bug target/112816] [11/12 Regression] ICE unrecognizable_insn with __builtin_signbit and returning struct with int[4]

2023-12-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112816

--- Comment #13 from GCC Commits  ---
The releases/gcc-12 branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:0246a37ebdef4ffc863fe4c56e812c0614026e02

commit r12-10052-g0246a37ebdef4ffc863fe4c56e812c0614026e02
Author: Jakub Jelinek 
Date:   Mon Dec 4 09:00:18 2023 +0100

i386: Fix up signbit2 expander [PR112816]

The following testcase ICEs, because the signbit2 expander uses an
explicit SUBREG in the pattern around match_operand with register_operand
predicate.  If we are unlucky enough that expansion tries to expand it
with some SUBREG as operands[1], we have two nested SUBREGs in the IL,
which is not valid and causes ICE later.

2023-12-04  Jakub Jelinek  

PR target/112816
* config/i386/sse.md (signbit2): Force operands[1] into a
REG.

* gcc.target/i386/sse2-pr112816.c: New test.

(cherry picked from commit 994d6dc64435d6b7c50accca9941ee7decd92a22)

[Bug target/112845] [11/12 Regression] ICE: in extract_insn, at recog.cc:2804 with -Os -fcf-protection -c since r8-3504

2023-12-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112845

--- Comment #7 from GCC Commits  ---
The releases/gcc-12 branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:87d013f7c877f944adbbaa4e94244baf990cb9f2

commit r12-10054-g87d013f7c877f944adbbaa4e94244baf990cb9f2
Author: Jakub Jelinek 
Date:   Tue Dec 5 13:17:57 2023 +0100

i386: Fix -fcf-protection -Os ICE due to movabsq peephole2 [PR112845]

The following testcase ICEs in the movabsq $(i32 << shift), r64 peephole2
I've added a while back to use smaller code than movabsq if possible.
If i32 is 0xfa1e0ff3 and shift is not divisible by 8, then it creates
an invalid insn (as 0xfa1e0ff3 CONST_INT is not allowed as
x86_64_immediate_operand nor x86_64_zext_immediate_operand), the peephole2
even triggers on it again and again (this time with shift 0) until it gives
up.

The following patch fixes that.  As ix86_endbr_immediate_operand needs a
CONST_INT and it is hopefully rare, I chose to use FAIL rather than
handling it in the condition (where I'd probably need to call ctz_hwi
again etc.).
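
A minimal sketch of the idea, assuming a hypothetical helper
split_shifted_imm that is not part of GCC: split a 64-bit constant into a
32-bit immediate and a shift count, and give up when the immediate would be
0xfa1e0ff3, whose little-endian bytes are f3 0f 1e fa (ENDBR64). As a
simplification, the sketch rejects that immediate for every shift count,
whereas the commit only discusses shifts not divisible by 8.

```cpp
#include <cassert>
#include <cstdint>

// Immediate whose little-endian byte sequence is f3 0f 1e fa (ENDBR64).
constexpr uint32_t kEndbr64Imm = 0xfa1e0ff3u;

struct ShiftedImm {
  bool ok;         // false means "give up", similar in spirit to FAIL
  uint32_t imm32;  // value that would be loaded with a 32-bit move
  unsigned shift;  // left shift applied afterwards
};

// Hypothetical helper: try to express c as imm32 << shift.
ShiftedImm split_shifted_imm(uint64_t c) {
  if (c == 0) return {false, 0, 0};
  unsigned shift = 0;
  while ((c & 1) == 0) {  // strip and count trailing zero bits
    c >>= 1;
    ++shift;
  }
  assert(shift < 64);
  if (c > 0xffffffffull) return {false, 0, 0};   // does not fit in 32 bits
  const uint32_t imm = static_cast<uint32_t>(c);
  if (imm == kEndbr64Imm) return {false, 0, 0};  // would embed ENDBR64 bytes
  return {true, imm, shift};
}
```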

2023-12-05  Jakub Jelinek  

PR target/112845
* config/i386/i386.md (movabsq $(i32 << shift), r64 peephole2): FAIL
if the new immediate is ix86_endbr_immediate_operand.

(cherry picked from commit e0786ca9a18c50ad08c40936b228e325193664b8)

[Bug target/112837] [12 Regression] ICE: RTL check: expected elt 1 type 'i' or 'n', have 'e' (rtx plus) in ix86_elim_entry_set_got, at config/i386/i386.cc:8612 with -fcompare-elim -fpie -fprofile

2023-12-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112837

--- Comment #7 from GCC Commits  ---
The releases/gcc-12 branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:d02372f39dc6506a4ff88a7f01fb5570a82606c0

commit r12-10053-gd02372f39dc6506a4ff88a7f01fb5570a82606c0
Author: Jakub Jelinek 
Date:   Mon Dec 4 09:01:09 2023 +0100

i386: Fix rtl checking ICE in ix86_elim_entry_set_got [PR112837]

The following testcase ICEs with RTL checking, because the code tests whether
XINT (SET_SRC (set), 1) is UNSPEC_SET_GOT without checking if SET_SRC (set)
is actually an UNSPEC, so any time we see any other insn with PARALLEL
and a SET in it which is not an UNSPEC we ICE during RTL checking or
access some other union member there as if it were an rt_int.
The rest is just small cleanup.
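
The pattern can be sketched with a toy tagged union. Expr, Code, kSetGot and
both functions are hypothetical, not GCC's rtx or its accessor macros. The
checked accessor models what RTL checking enforces, and is_set_got tests the
code before touching the union member.

```cpp
#include <cassert>

enum class Code { Unspec, Plus };

// Toy expression with a tagged union; which member is valid depends on code.
struct Expr {
  Code code;
  union {
    int unspec_kind;      // valid only when code == Code::Unspec
    const Expr* operand;  // valid only when code == Code::Plus
  };
};

constexpr int kSetGot = 7;  // hypothetical id standing in for UNSPEC_SET_GOT

// Checked accessor: aborts on a mismatched code, much like RTL checking does.
int unspec_kind_of(const Expr& e) {
  assert(e.code == Code::Unspec);
  return e.unspec_kind;
}

// Test the code first, then read the member that is valid for that code.
bool is_set_got(const Expr& src) {
  return src.code == Code::Unspec && unspec_kind_of(src) == kSetGot;
}
```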

2023-12-04  Jakub Jelinek  

PR target/112837
* config/i386/i386.cc (ix86_elim_entry_set_got): Before checking
for UNSPEC_SET_GOT check that SET_SRC is UNSPEC.  Use SET_SRC and
SET_DEST macros instead of XEXP, rename vec variable to set.

* gcc.dg/pr112837.c: New test.

(cherry picked from commit 4586d7d0a92e9b60d0c01043e0ae262b1e06f337)

[Bug c/112339] ICE with clang::no_sanitize and -fsanitize=

2023-12-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112339

--- Comment #6 from GCC Commits  ---
The releases/gcc-12 branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:029b826f62d2d193e93539749a534b9a13ade728

commit r12-10048-g029b826f62d2d193e93539749a534b9a13ade728
Author: Jakub Jelinek 
Date:   Thu Nov 9 09:05:54 2023 +0100

attribs: Fix ICE with -Wno-attributes= [PR112339]

The following testcase ICEs, because with -Wno-attributes=foo::no_sanitize
(but generally any other non-gnu namespace and some gnu well known attribute
name within that other namespace) the FEs don't really parse attribute
arguments of such attribute, but lookup_attribute_spec is non-NULL with
NULL handler and such attributes are added to DECL_ATTRIBUTES or
TYPE_ATTRIBUTES and then when e.g. middle-end does lookup_attribute
on a particular attribute and expects the attribute to mean something
and/or have particular verified arguments, it can crash when seeing
the foreign attribute in there instead.

The following patch fixes that by never adding ignored attributes
to DECL_ATTRIBUTES/TYPE_ATTRIBUTES, previously that was the case just
for attributes in ignored namespace (where lookup_attribute_spec
returned NULL).  We don't really know anything about those attributes,
so shouldn't pretend we know something about them, especially when
the arguments are error_mark_node or NULL instead of something that
would have been parsed.  And it would be really weird if we normally
ignore say [[clang::unused]] attribute, but when people use
-Wno-attributes=clang::unused we actually treated it as gnu::unused.
All the user asked for is suppress warnings about that attribute being
unknown.

The first hunk is just playing safe, I'm worried people could use
-Wno-attributes=gnu:: and get various crashes with known GNU attributes
not being actually parsed and recorded (or worse e.g. when we tweak
standard attributes into GNU attributes and we wouldn't add those).
The -Wno-attributes= documentation says that it suppresses warning about
unknown attributes, so I think -Wno-attributes=gnu:: should prevent
warning about say [[gnu::foobarbaz]] attribute, but not about
[[gnu::unused]] because the latter is a known attribute.
The routine would return true for any scoped attribute in the ignored
namespace, with the change it ignores only unknown attributes in ignored
namespace, known ones in there will be ignored only if they have
max_length of -2 (e.g. with
-Wno-attributes=gnu:: -Wno-attributes=gnu::foobarbaz).
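
The intended behaviour can be modelled with a small sketch. AttrModel and its
members are hypothetical and only mimic the rule described above; they are not
the data structures used in attribs.cc. An unknown attribute is ignored when
its namespace is ignored, and a known one only when it was named explicitly.

```cpp
#include <cassert>
#include <set>
#include <string>

// Toy model of the rule; none of these names exist in GCC.
struct AttrModel {
  std::set<std::string> known;          // recognised attributes, as "ns::name"
  std::set<std::string> ignored_ns;     // from -Wno-attributes=ns::
  std::set<std::string> ignored_attrs;  // from -Wno-attributes=ns::name

  bool ignored(const std::string& ns, const std::string& name) const {
    assert(!ns.empty());  // only scoped attributes are modelled
    const std::string full = ns + "::" + name;
    const bool is_known = known.count(full) != 0;
    // Unknown attribute in an ignored namespace: ignore it.
    if (!is_known && ignored_ns.count(ns) != 0) return true;
    // Otherwise ignore it only if it was listed explicitly.
    return ignored_attrs.count(full) != 0;
  }
};
```

Under this model, ignoring the gnu namespace suppresses gnu::foobarbaz but
leaves a known attribute such as gnu::unused in effect.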

2023-11-09  Jakub Jelinek  

PR c/112339
* attribs.cc (attribute_ignored_p): Only return true for
attr_namespace_ignored_p if as is NULL.
(decl_attributes): Never add ignored attributes.

* c-c++-common/ubsan/Wno-attributes-1.c: New test.

(cherry picked from commit 533241c6c60bc7c9f7dc47a94e94b5eed1b370e6)

[Bug c++/112795] [C++>=14] ICE pragma GCC unroll (n) cxx_eval_constant_expression

2023-12-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112795

--- Comment #11 from GCC Commits  ---
The releases/gcc-12 branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:f1761303c77219281c22ab50b4a9ce65920c4023

commit r12-10051-gf1761303c77219281c22ab50b4a9ce65920c4023
Author: Jakub Jelinek 
Date:   Mon Dec 4 08:59:15 2023 +0100

c++: #pragma GCC unroll C++ fixes [PR112795]

foo in the unroll-5.C testcase ICEs because cp_parser_pragma_unroll
during parsing calls maybe_constant_value unconditionally, which is
fine if !processing_template_decl, but can ICE otherwise.

While just calling fold_non_dependent_expr there instead could be enough
to fix the ICE (and I guess the right thing to do for backports if any),
I don't see a reason why we couldn't handle a dependent #pragma GCC unroll
argument as well, the unrolling isn't done in the FE and all the middle-end
cares about is that ANNOTATE_EXPR has a 1..65534 last operand when it is
annot_expr_unroll_kind.

So, the following patch changes all the unsigned short unroll arguments
to tree unroll (and thus avoids the tree -> unsigned short -> tree
conversions), does the type and value checking during parsing only if
the argument isn't dependent and repeats it during instantiation.

2023-12-04  Jakub Jelinek  

PR c++/112795
gcc/cp/
* parser.cc (cp_parser_pragma_unroll): Use fold_non_dependent_expr
instead of maybe_constant_value.
gcc/testsuite/
* g++.dg/ext/unroll-5.C: New test.

(cherry picked from commit b6c78feea08c36e5754818c6a3d7536b3f8913dc)

[Bug target/111408] [14 Regression] Wrong code at -O2/3 on x86_64-linux-gnu since r14-2866-ge68a31549d9

2023-12-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111408

--- Comment #8 from GCC Commits  ---
The releases/gcc-12 branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:9cb9cecdb582b44b04e9c49532ddf0065b5c3612

commit r12-10050-g9cb9cecdb582b44b04e9c49532ddf0065b5c3612
Author: Jakub Jelinek 
Date:   Sat Nov 25 10:31:55 2023 +0100

i386: Fix up *jcc_bt*_mask{,_1} [PR111408]

The following testcase is miscompiled in GCC 14 because the
*jcc_bt_mask and *jcc_bt_mask_1 patterns have just
one argument in (match_operator 0 "bt_comparison_operator" [...])
but as bt_comparison_operator is eq,ne, we need two.
The md readers don't warn about it, after all, some checks can
be done in the predicate rather than specified explicitly, and the
behavior is that anything is accepted as the second argument.

I went through all other i386.md match_operator uses and all others
looked right (extract_operator using 3 operands, all others 2).

I think we'll want to fix this at different spots in older releases
because I think the bug was introduced already in 2008, though most
likely just latent.

2023-11-25  Jakub Jelinek  

PR target/111408
* config/i386/i386.md (*jcc_bt_mask): Add (const_int 0) as
expected second operand of bt_comparison_operator.

* gcc.c-torture/execute/pr111408.c: New test.

(cherry picked from commit 9866c98e1015d98b8fc346d7cf73a0070cce5f69)

[Bug tree-optimization/111967] [12 Regression] during GIMPLE pass: evrp ICE: in operator[], at vec.h:910 with -O2 -fno-tree-forwprop -fdump-tree-evrp-all since r12-4694

2023-12-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111967

--- Comment #8 from GCC Commits  ---
The releases/gcc-12 branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:327713d54122ab5635f3c385aecf113e5afe190c

commit r12-10049-g327713d54122ab5635f3c385aecf113e5afe190c
Author: Jakub Jelinek 
Date:   Mon Nov 13 08:47:41 2023 +0100

gimple-range-cache: Fix ICEs when dumping details [PR111967]

The following testcase ICEs when dumping details.
When m_ssa_ranges vector is created, it is safe_grow_cleared (num_ssa_names),
but when some new SSA_NAME is added, we strangely grow it to
num_ssa_names + 1 instead and later on the 3 argument dump method
iterates from 1 to m_ssa_ranges.length () - 1 and uses ssa_name (x)
on each; but because set_bb_range grew it one too much, ssa_name
(m_ssa_ranges.length () - 1) might be after the end of the ssanames
vector and ICE.

The fix grows the vector consistently only to num_ssa_names,
doesn't waste time checking m_ssa_ranges[0] because there is no
ssa_names (0), it is always NULL, before using ssa_name (x) checks
if we'll need it at all (we check later if m_ssa_ranges[x] is non-NULL,
so we might check it earlier as well) and also in the last loop
iterates until m_ssa_ranges.length () rather than num_ssa_names, I don't
see a reason for the inconsistency and in theory some SSA_NAME could be
added without set_bb_range called for it and the vector could be shorter
than the ssanames vector.

To actually fix the ICE, either the first hunk or the last 2 hunks
would be enough, but I think it doesn't hurt to change all the spots.
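
A minimal sketch of the safer iteration order, using plain std::vector and a
hypothetical count_dumpable function rather than the real block_range_cache
code: start at index 1, skip empty cache slots first, and only then touch the
name table, so a cache that is shorter than the name table is handled too.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// names stands in for the ssanames table, ranges for m_ssa_ranges.
// Returns how many entries a dump would print.
std::size_t count_dumpable(const std::vector<std::string>& names,
                           const std::vector<const int*>& ranges) {
  std::size_t printed = 0;
  // Index 0 is skipped because there is no name 0.
  for (std::size_t x = 1; x < ranges.size(); ++x) {
    if (ranges[x] == nullptr) continue;  // check the cache slot first
    assert(x < names.size());            // holds if ranges never outgrows names
    ++printed;
  }
  return printed;
}
```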

2023-11-13  Jakub Jelinek  

PR tree-optimization/111967
* gimple-range-cache.cc (block_range_cache::set_bb_range): Grow
m_ssa_ranges to num_ssa_names rather than num_ssa_names + 1.
(block_range_cache::dump): Iterate from 1 rather than 0.  Don't use
ssa_name (x) unless m_ssa_ranges[x] is non-NULL.  Iterate to
m_ssa_ranges.length () rather than num_ssa_names.

* gcc.dg/tree-ssa/pr111967.c: New test.

(cherry picked from commit 5a0c302d2d721b9650c1e354695dbba87364c334)

[Bug tree-optimization/110731] [11/12 Regression] Wrong-code because of wide-int division since r5-424

2023-12-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110731

--- Comment #6 from GCC Commits  ---
The releases/gcc-12 branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:9a93c334865af18ce2fd50cd77a9e90939f3c231

commit r12-10046-g9a93c334865af18ce2fd50cd77a9e90939f3c231
Author: Jakub Jelinek 
Date:   Wed Jul 19 13:48:53 2023 +0200

wide-int: Fix up wi::divmod_internal [PR110731]

As the following testcase shows, wi::divmod_internal doesn't handle
correctly signed division with precision > 64 when the dividend (and likely
divisor as well) is the type's minimum and the precision isn't divisible
by 64.

A few lines above what the patch hunk changes is:
  /* Make the divisor and dividend positive and remember what we
 did.  */
  if (sgn == SIGNED)
{
  if (wi::neg_p (dividend))
{
  neg_dividend = -dividend;
  dividend = neg_dividend;
  dividend_neg = true;
}
  if (wi::neg_p (divisor))
{
  neg_divisor = -divisor;
  divisor = neg_divisor;
  divisor_neg = true;
}
}
i.e. we negate negative dividend or divisor and remember those.
But, after we do that, when unpacking those values into b_dividend and
b_divisor we need to always treat the wide_ints as UNSIGNED,
because divmod_internal_2 performs an unsigned division only.
Now, if precision <= 64, we don't reach here at all, earlier code
handles it.  If dividend or divisor aren't the most negative values,
the negation clears their most significant bit, so it doesn't really
matter if we unpack SIGNED or UNSIGNED.  And if precision is multiple
of HOST_BITS_PER_WIDE_INT, there is no difference in behavior, while
-0x8000 negates to
-0x8000 the unpacking of it as SIGNED
or UNSIGNED works the same.
In the testcase, we have signed precision 119 and the dividend is
val = { 0, 0xffc0 }, len = 2, precision = 119
both before and after negation.
Divisor is
val = { 2 }, len = 1, precision = 119
But we really want to divide 0x40 by 2
unsigned and then negate at the end.
If it is unsigned precision 119 division
0x40 by 2
dividend is
val = { 0, 0xffc0 }, len = 2, precision = 119
but as we unpack it UNSIGNED, it is unpacked into
0, 0, 0, 0x0040

The following patch fixes it by always using UNSIGNED unpacking
because we've already negated negative values at that point if
sgn == SIGNED and so most negative constants should be treated as
positive.
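
A scaled-down analogue may help: 12-bit precision stored sign-extended in
16 bits, instead of 119-bit precision stored in 64-bit elements. The unpack
function below is a hypothetical illustration, not wi::unpack. The most
negative 12-bit value is stored as 0xf800 and negates to itself, so only a
zero-extending unpack yields the magnitude 0x0800 that the unsigned division
needs.

```cpp
#include <cassert>
#include <cstdint>

// Unpack a value that is stored sign-extended in 16 bits but has only
// `precision` significant bits.
uint16_t unpack(uint16_t stored, unsigned precision, bool as_signed) {
  assert(precision > 0 && precision < 16);
  const uint16_t mask = static_cast<uint16_t>((1u << precision) - 1);
  const uint16_t low = stored & mask;
  if (!as_signed) return low;  // zero-extend: bits above precision are cleared
  const bool top_bit = ((low >> (precision - 1)) & 1) != 0;
  // Sign-extend: copy the top significant bit into the upper bits.
  return top_bit ? static_cast<uint16_t>(low | ~mask) : low;
}
```

Dividing the zero-extended value by 2 gives 0x0400, which is then negated at
the end; the sign-extended value 0xf800 would be divided as if it were 63488.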

2023-07-19  Jakub Jelinek  

PR tree-optimization/110731
* wide-int.cc (wi::divmod_internal): Always unpack dividend and
divisor as UNSIGNED regardless of sgn.

* gcc.dg/pr110731.c: New test.

(cherry picked from commit ece799607c841676f4e00c2fea98bbec6976da3f)

Re: [PATCH] libstdc++: Make __gnu_debug::vector usable in constant expressions [PR109536]

2023-12-15 Thread Patrick Palka
On Wed, 6 Dec 2023, Jonathan Wakely wrote:

> Any comments on this approach?
> 
> -- >8 --
> 
> This makes constexpr std::vector (mostly) work in Debug Mode. All safe
> iterator instrumentation and checking is disabled during constant
> evaluation, because it requires mutex locks and calls to non-inline
> functions defined in libstdc++.so. It should be OK to disable the safety
> checks, because most UB should be detected during constant evaluation
> anyway.
> 
> We could try to enable the full checking in constexpr, but it would mean
> wrapping all the non-inline functions like _M_attach with an inline
> _M_constexpr_attach that does the iterator housekeeping inline without
> mutex locks when calling for constant evaluation, and calls the
> non-inline function at runtime. That could be done in future if we find
> that we've lost safety or useful checking by disabling the safe
> iterators.
> 
> There are a few test failures in C++20 mode, which I'm unable to
> explain. The _Safe_iterator::operator++() member gives errors for using
> non-constexpr functions during constant evaluation, even though those
> functions are guarded by std::is_constant_evaluated() checks. The same
> code works fine for C++23 and up.

AFAICT these C++20 test failures are really due to the variable
definition of non-literal type

  __gnu_cxx::__scoped_lock __l(this->_M_get_mutex());

which was prohibited in a constexpr function (even if that code was
never executed) until C++23's P2242R3.

We can use an immediately invoked lambda to work around this:

  [this] {
    __gnu_cxx::__scoped_lock __l(this->_M_get_mutex());
    ++base();
  }();
  return *this;
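
A self-contained sketch of the same workaround follows. ScopedLock and Cursor
are hypothetical stand-ins for __gnu_cxx::__scoped_lock and _Safe_iterator,
and __builtin_is_constant_evaluated is assumed to be available (a GCC and
Clang builtin) so that the sketch also builds in C++17 mode.

```cpp
// Hypothetical stand-in for __gnu_cxx::__scoped_lock: a non-literal RAII type.
struct ScopedLock {
  static int acquisitions;
  ScopedLock() { ++acquisitions; }
  ~ScopedLock() {}
};
int ScopedLock::acquisitions = 0;

struct Cursor {
  int pos = 0;

  constexpr Cursor& advance() {
    if (__builtin_is_constant_evaluated()) {
      ++pos;  // constant evaluation: no locking
    } else {
      // Declaring `ScopedLock lock;` directly in this function is rejected
      // before C++23, so the variable lives in an immediately invoked lambda.
      [this] {
        ScopedLock lock;
        ++pos;
      }();
    }
    return *this;
  }
};

constexpr int advanced_twice() {
  Cursor c;
  c.advance().advance();
  return c.pos;
}

static_assert(advanced_twice() == 2, "constant evaluation skips the lock");
```

At run time the same advance call goes through the lambda and takes the lock.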

> 
> libstdc++-v3/ChangeLog:
> 
>   PR libstdc++/109536
>   * include/bits/c++config (__glibcxx_constexpr_assert): Remove
>   macro.
>   * include/bits/stl_algobase.h (__niter_base, __copy_move_a)
>   (__copy_move_backward_a, __fill_a, __fill_n_a, __equal_aux)
>   (__lexicographical_compare_aux): Add constexpr to overloads for
>   debug mode iterators.
>   * include/debug/helper_functions.h (__unsafe): Add constexpr.
>   * include/debug/macros.h (_GLIBCXX_DEBUG_VERIFY_COND_AT): Remove
>   macro, folding it into ...
>   (_GLIBCXX_DEBUG_VERIFY_AT_F): ... here. Do not use
>   __glibcxx_constexpr_assert.
>   * include/debug/safe_base.h (_Safe_iterator_base): Add constexpr
>   to some member functions. Omit attaching, detaching and checking
>   operations during constant evaluation.
>   * include/debug/safe_container.h (_Safe_container): Likewise.
>   * include/debug/safe_iterator.h (_Safe_iterator): Likewise.
>   * include/debug/safe_iterator.tcc (__niter_base, __copy_move_a)
>   (__copy_move_backward_a, __fill_a, __fill_n_a, __equal_aux)
>   (__lexicographical_compare_aux): Add constexpr.
>   * include/debug/vector (_Safe_vector, vector): Add constexpr.
>   Omit safe iterator operations during constant evaluation.
>   * testsuite/23_containers/vector/bool/capacity/constexpr.cc:
>   Remove dg-xfail-if for debug mode.
>   * testsuite/23_containers/vector/bool/cmp_c++20.cc: Likewise.
>   * testsuite/23_containers/vector/bool/cons/constexpr.cc:
>   Likewise.
>   * testsuite/23_containers/vector/bool/element_access/1.cc:
>   Likewise.
>   * testsuite/23_containers/vector/bool/element_access/constexpr.cc:
>   Likewise.
>   * testsuite/23_containers/vector/bool/modifiers/assign/constexpr.cc:
>   Likewise.
>   * testsuite/23_containers/vector/bool/modifiers/constexpr.cc:
>   Likewise.
>   * testsuite/23_containers/vector/bool/modifiers/swap/constexpr.cc:
>   Likewise.
>   * testsuite/23_containers/vector/capacity/constexpr.cc:
>   Likewise.
>   * testsuite/23_containers/vector/cmp_c++20.cc: Likewise.
>   * testsuite/23_containers/vector/cons/constexpr.cc: Likewise.
>   * testsuite/23_containers/vector/data_access/constexpr.cc:
>   Likewise.
>   * testsuite/23_containers/vector/element_access/constexpr.cc:
>   Likewise.
>   * testsuite/23_containers/vector/modifiers/assign/constexpr.cc:
>   Likewise.
>   * testsuite/23_containers/vector/modifiers/constexpr.cc:
>   Likewise.
>   * testsuite/23_containers/vector/modifiers/swap/constexpr.cc:
>   Likewise.
> ---
>  libstdc++-v3/include/bits/c++config   |   9 -
>  libstdc++-v3/include/bits/stl_algobase.h  |  15 ++
>  libstdc++-v3/include/debug/helper_functions.h |   1 +
>  libstdc++-v3/include/debug/macros.h   |   9 +-
>  libstdc++-v3/include/debug/safe_base.h|  35 +++-
>  libstdc++-v3/include/debug/safe_container.h   |  15 +-
>  libstdc++-v3/include/debug/safe_iterator.h| 186 +++---
>  libstdc++-v3/include/debug/safe_iterator.tcc  |  15 ++
>  libstdc++-v3/include/debug/vector | 146 --
>  .../vector/bool/capacity/constexpr.cc |   1 -
>  

[PATCH] c++: Fix unchecked use of CLASSTYPE_AS_BASE [PR113031]

2023-12-15 Thread Nathaniel Shead
Bootstrapped and regtested on x86_64-pc-linux-gnu with GLIBCXX_TESTSUITE_STDS=20
and RUNTESTFLAGS="--target_board=unix/-D_GLIBCXX_USE_CXX11_ABI=0".

-- >8 --

My previous patch (naively) assumed that a TREE_CODE of RECORD_TYPE or
UNION_TYPE was sufficient for optype to be considered a "class type".
However, this does not account for e.g. template type parameters of
record or union type. This patch corrects that by checking CLASS_TYPE_P
before checking for as-base conversion.

PR c++/113031

gcc/cp/ChangeLog:

* constexpr.cc (cxx_fold_indirect_ref_1): Check for CLASS_TYPE_P
before using CLASSTYPE_AS_BASE.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/pr113031.C: New test.

Signed-off-by: Nathaniel Shead 
---
 gcc/cp/constexpr.cc   |  3 ++-
 gcc/testsuite/g++.dg/cpp0x/pr113031.C | 34 +++
 2 files changed, 36 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/pr113031.C

diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
index e1b2d27fc36..051f73fb73f 100644
--- a/gcc/cp/constexpr.cc
+++ b/gcc/cp/constexpr.cc
@@ -5709,7 +5709,8 @@ cxx_fold_indirect_ref_1 (const constexpr_ctx *ctx, location_t loc, tree type,
  }
 
   /* Handle conversion to "as base" type.  */
-  if (CLASSTYPE_AS_BASE (optype) == type)
+  if (CLASS_TYPE_P (optype)
+ && CLASSTYPE_AS_BASE (optype) == type)
return op;
 
   /* Handle conversion to an empty base class, which is represented with a
diff --git a/gcc/testsuite/g++.dg/cpp0x/pr113031.C b/gcc/testsuite/g++.dg/cpp0x/pr113031.C
new file mode 100644
index 000..aecdc3fc4b2
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/pr113031.C
@@ -0,0 +1,34 @@
+// PR c++/113031
+// { dg-do compile }
+
+template  struct variant;
+
+template 
+variant<_Types> __variant_cast(_Tp __rhs) { return static_cast&>(__rhs); }
+
+template 
+struct _Move_assign_base : _Types {
+  void operator=(_Move_assign_base __rhs) { __variant_cast<_Types>(__rhs); }
+};
+
+template 
+struct variant : _Move_assign_base<_Types> {
+  void emplace() {
+variant __tmp;
+*this = __tmp;
+  }
+};
+
+struct _Undefined_class {
+  struct _Nocopy_types {
+void (_Undefined_class::*_M_member_pointer)();
+  };
+  struct function : _Nocopy_types {
+struct optional {
+  void test03() {
+variant v;
+v.emplace();
+  }
+};
+  };
+};
-- 
2.42.0



Re: [PATCH] RISC-V: Don't make Ztso imply A

2023-12-15 Thread Andrew Waterman
On Fri, Dec 15, 2023 at 1:38 PM Jeff Law  wrote:
>
>
>
> On 12/12/23 20:54, Palmer Dabbelt wrote:
> > I can't actually find anything in the ISA manual that makes Ztso imply
> > A.  In theory the memory ordering is just a different thing than the set
> > of available instructions (ie, Ztso without A would still imply TSO for
> > loads and stores).  It also seems like a configuration that could be
> > sane to build: without A it's all but impossible to write any meaningful
> > multi-core code, and TSO is really cheap for a single core.
> >
> > That said, I think it's kind of reasonable to provide A to users asking
> > for Ztso.  So maybe even if this was a mistake it's the right thing to
> > do?
> >
> > gcc/ChangeLog:
> >
> >   * common/config/riscv/riscv-common.cc (riscv_implied_info):
> >   Remove {"ztso", "a"}.
> I'd tend to think step #1 is to determine what the ISA intent is,
> meaning engagement with RVI.
>
> We've got time for that engagement and to adjust based on the result.
> So I'd tend to defer until we know if Ztso should imply A or not.

Palmer is correct.  There is no coupling between Ztso and A.  (And
there are uncontrived examples of such systems: e.g. embedded
processors without caches that don't support the LR/SC instructions,
but happen to be TSO.)

>
> jeff


Re: Request for Direction.

2023-12-15 Thread James K. Lowden
On Fri, 15 Dec 2023 14:43:22 -0500
"David H. Lynch Jr. via Gcc"  wrote:

> Right now I am just focused on some means to deliver support. 

Hi David, 

My colleague Bob Dubner and I have been extending GCC every day for
the last two years.  I wonder if we might be of some use to you.  

I only faintly hope our project can benefit from your work. We're
adding a Cobol front end to GCC.  Cobol has built-in sort functions,
both on disk and in memory, and a rich data-description language.
There is more potential there than might seem at first blush, and I
would welcome the opportunity to explain in detail if you're
interested.  

If your objective is simply to extend C to support content addressable
memory, then we might still be of some help.  I don't know anything,
really, about the C front-end, but Bob has experience getting
Generic to generate code.  He might be able to answer some of your
questions, if nothing else.

Let me know what you think.  

Kind regards, 

--jkl



[Bug c++/55004] [meta-bug] constexpr issues

2023-12-15 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55004
Bug 55004 depends on bug 86970, which changed state.

Bug 86970 Summary: Rejected constexpr expression involving lambdas and inheritance, "use of this in a constant expression"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86970

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

[Bug c++/86970] Rejected constexpr expression involving lambdas and inheritance, "use of this in a constant expression"

2023-12-15 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86970

Patrick Palka  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
   Target Milestone|--- |10.0
 CC||ppalka at gcc dot gnu.org
 Resolution|--- |FIXED

--- Comment #6 from Patrick Palka  ---
Fixed with -std=c++17 since r10-3661-g8e007055dd1374 apparently.

Re: [PATCH v4 3/3] RISC-V: Add support for XCVbi extension in CV32E40P

2023-12-15 Thread Jeff Law




On 12/12/23 12:32, Mary Bennett wrote:

Spec: github.com/openhwgroup/core-v-sw/blob/master/specifications/corev-builtin-spec.md

Contributors:
   Mary Bennett 
   Nandni Jamnadas 
   Pietra Ferreira 
   Charlie Keaney
   Jessica Mills
   Craig Blackmore 
   Simon Cook 
   Jeremy Bennett 
   Helene Chelin 

gcc/ChangeLog:
* common/config/riscv/riscv-common.cc: Create XCVbi extension
  support.
* config/riscv/riscv.opt: Likewise.
* config/riscv/corev.md: Implement cv_branch pattern
  for cv.beqimm and cv.bneimm.
* config/riscv/riscv.md: Add CORE-V branch immediate to RISC-V
  branch instruction pattern.
* config/riscv/constraints.md: Implement constraints
  cv_bi_s5 - signed 5-bit immediate.
* config/riscv/predicates.md: Implement predicate
  const_int5s_operand - signed 5 bit immediate.
* doc/sourcebuild.texi: Add XCVbi documentation.

gcc/testsuite/ChangeLog:
* gcc.target/riscv/cv-bi-beqimm-compile-1.c: New test.
* gcc.target/riscv/cv-bi-beqimm-compile-2.c: New test.
* gcc.target/riscv/cv-bi-bneimm-compile-1.c: New test.
* gcc.target/riscv/cv-bi-bneimm-compile-2.c: New test.
* lib/target-supports.exp: Add proc for XCVbi.
---
  gcc/common/config/riscv/riscv-common.cc   |  2 +
  gcc/config/riscv/constraints.md   |  6 +++
  gcc/config/riscv/corev.md | 32 +
  gcc/config/riscv/predicates.md|  4 ++
  gcc/config/riscv/riscv.md |  2 +-
  gcc/config/riscv/riscv.opt|  2 +
  gcc/doc/sourcebuild.texi  |  3 ++
  .../gcc.target/riscv/cv-bi-beqimm-compile-1.c | 17 +++
  .../gcc.target/riscv/cv-bi-beqimm-compile-2.c | 48 +++
  .../gcc.target/riscv/cv-bi-bneimm-compile-1.c | 17 +++
  .../gcc.target/riscv/cv-bi-bneimm-compile-2.c | 48 +++
  gcc/testsuite/lib/target-supports.exp | 13 +
  12 files changed, 193 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/gcc.target/riscv/cv-bi-beqimm-compile-1.c
  create mode 100644 gcc/testsuite/gcc.target/riscv/cv-bi-beqimm-compile-2.c
  create mode 100644 gcc/testsuite/gcc.target/riscv/cv-bi-bneimm-compile-1.c
  create mode 100644 gcc/testsuite/gcc.target/riscv/cv-bi-bneimm-compile-2.c




diff --git a/gcc/config/riscv/constraints.md b/gcc/config/riscv/constraints.md
index 2711efe68c5..718b4bd77df 100644
--- a/gcc/config/riscv/constraints.md
+++ b/gcc/config/riscv/constraints.md
@@ -247,3 +247,9 @@
(and (match_code "const_int")
 (and (match_test "IN_RANGE (ival, 0, 1073741823)")
  (match_test "exact_log2 (ival + 1) != -1"
+
+(define_constraint "CV_bi_sign5"
+  "@internal
+   A 5-bit signed immediate for CORE-V Immediate Branch."
+  (and (match_code "const_int")
+   (match_test "IN_RANGE (ival, -16, 15)")))
diff --git a/gcc/config/riscv/corev.md b/gcc/config/riscv/corev.md
index 92bf0b5d6a6..92e30a8ae04 100644
--- a/gcc/config/riscv/corev.md
+++ b/gcc/config/riscv/corev.md
@@ -706,3 +706,35 @@
  
[(set_attr "type" "load")

(set_attr "mode" "SI")])
+
+;; XCVBI Instructions
+(define_insn "cv_branch"
+  [(set (pc)
+   (if_then_else
+(match_operator 1 "equality_operator"
+[(match_operand:X 2 "register_operand" "r")
+ (match_operand:X 3 "const_int5s_operand" "CV_bi_sign5")])
+(label_ref (match_operand 0 "" ""))
+(pc)))]
+  "TARGET_XCVBI"
+  "cv.b%C1imm\t%2,%3,%0"
+  [(set_attr "type" "branch")
+   (set_attr "mode" "none")])

So I think Kito wanted the name of this pattern to be prefixed with '*'.

My question is how does that pattern deal with out of range branch 
targets?  As Kito mentioned on the V3, you probably need to handle that.



I think this suggestion from Kito was meant to be added to that pattern 
so that it works in a manner similar to the *branch pattern:



if (get_attr_length (insn) == 12)
  return "cv.b%N1\t%2,%z3,1f; jump\t%l0,ra; 1:";



Jeff


Re: [PATCH v4 2/3] RISC-V: Update XCValu constraints to match other vendors

2023-12-15 Thread Jeff Law




On 12/12/23 12:32, Mary Bennett wrote:

gcc/ChangeLog:
* config/riscv/constraints.md: CVP2 -> CV_alu_pow2.
* config/riscv/corev.md: Likewise.
---

Kito ack'd the V3 patch, so I went ahead and pushed this to the trunk.

jeff


[Bug other/113038] New: [14 regression] Excess errors for g++.dg/modules/hello-1_b.C after r14-6569-gfe54b57728c09a

2023-12-15 Thread seurer at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113038

Bug ID: 113038
   Summary: [14 regression] Excess errors for
g++.dg/modules/hello-1_b.C  after
r14-6569-gfe54b57728c09a
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: other
  Assignee: unassigned at gcc dot gnu.org
  Reporter: seurer at gcc dot gnu.org
  Target Milestone: ---

g:fe54b57728c09ab0389e2bb3f079d5210566199d, r14-6569-gfe54b57728c09a

FAIL: g++.dg/modules/hello-1 -std=c++2b execute
FAIL: g++.dg/modules/hello-1_b.C -std=c++2b (test for excess errors)


spawn -ignore SIGHUP
/home/seurer/gcc/git/build/gcc-test/gcc/testsuite/g++13/../../xg++
-B/home/seurer/gcc/git/build/gcc-test/gcc/testsuite/g++13/../../
/home/seurer/gcc/git/gcc-test/gcc/testsuite/g++.dg/modules/hello-1_b.C
-fdiagnostics-plain-output -nostdinc++
-I/home/seurer/gcc/git/build/gcc-test/powerpc64le-unknown-linux-gnu/libstdc++-v3/include/powerpc64le-unknown-linux-gnu
-I/home/seurer/gcc/git/build/gcc-test/powerpc64le-unknown-linux-gnu/libstdc++-v3/include
-I/home/seurer/gcc/git/gcc-test/libstdc++-v3/libsupc++
-I/home/seurer/gcc/git/gcc-test/libstdc++-v3/include/backward
-I/home/seurer/gcc/git/gcc-test/libstdc++-v3/testsuite/util -fmessage-length=0
-std=c++2b -pedantic-errors -Wno-long-long -fmodules-ts -S -o hello-1_b.s^M
/home/seurer/gcc/git/build/gcc-test/powerpc64le-unknown-linux-gnu/libstdc++-v3/include/bits/locale_classes.tcc:
In instantiation of 'const _Facet* std::__try_use_facet@hello(const
locale@hello&) [with _Facet = ctype@hello]':^M
/home/seurer/gcc/git/build/gcc-test/powerpc64le-unknown-linux-gnu/libstdc++-v3/include/bits/locale_classes.tcc:200:59:
  required from 'const _Facet& std::use_facet@hello(const locale@hello&) [with
_Facet = ctype@hello]'^M
/home/seurer/gcc/git/build/gcc-test/powerpc64le-unknown-linux-gnu/libstdc++-v3/include/bits/locale_facets_nonio.tcc:1573:63:
  required from '_InIter std::__cxx11::time_get@hello<_CharT,
_InIter>::do_get(iter_type, iter_type, std::ios_base&, std::ios_base::iostate&,
tm*, char, char) const [with _CharT = char; _InIter =
std::istreambuf_iterator >; iter_type =
std::istreambuf_iterator >; std::ios_base::iostate
= std::ios_base::iostate]'^M
/home/seurer/gcc/git/build/gcc-test/powerpc64le-unknown-linux-gnu/libstdc++-v3/include/bits/locale_facets_nonio.tcc:1567:5:
  required from here^M
: warning: new declaration 'void* __cxxabiv1::__dynamic_cast(const
void*, const __class_type_info*, const __class_type_info*, long int)'
ambiguates built-in declaration 'void* __cxxabiv1::__dynamic_cast@hello(const
void*, const __class_type_info@hello*, const __class_type_info@hello*, long
int)' [-Wbuiltin-declaration-mismatch]^M
FAIL: g++.dg/modules/hello-1_b.C -std=c++2b (test for excess errors)
Excess errors:
: warning: new declaration 'void* __cxxabiv1::__dynamic_cast(const
void*, const __class_type_info*, const __class_type_info*, long int)'
ambiguates built-in declaration 'void* __cxxabiv1::__dynamic_cast@hello(const
void*, const __class_type_info@hello*, const __class_type_info@hello*, long
int)' [-Wbuiltin-declaration-mismatch]


commit fe54b57728c09ab0389e2bb3f079d5210566199d (HEAD)
Author: Jonathan Wakely 
Date:   Thu Dec 14 23:23:34 2023 +

libstdc++: Implement C++23  header [PR107760]

gcc-12-20231215 is now available

2023-12-15 Thread GCC Administrator via Gcc
Snapshot gcc-12-20231215 is now available on
  https://gcc.gnu.org/pub/gcc/snapshots/12-20231215/
and on various mirrors, see https://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 12 git branch
with the following options: git://gcc.gnu.org/git/gcc.git branch 
releases/gcc-12 revision 5c3ab44771d0524140cf2ce5de594fcf7fefcd6f

You'll find:

 gcc-12-20231215.tar.xz   Complete GCC

  SHA256=d4781bdacb5dc60f013067fab33100f8b1dc142e15e4a913d26260cd6d790f4b
  SHA1=3eadf821d9a547482620710861e707c29787eddd

Diffs from 12-20231208 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-12
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


[Bug fortran/97592] [11/12/13/14 Regression] Incorrectly set pointer remapping with array pointer argument to CONTIGUOUS dummy

2023-12-15 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97592

anlauf at gcc dot gnu.org changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |anlauf at gcc dot gnu.org

--- Comment #6 from anlauf at gcc dot gnu.org ---
(In reply to anlauf from comment #3)
> It looks like argument association is confused here.
> (The F2018 reference is 15.5.2.3 and 15.5.2.4).
> 
> The following patch appears to fix the testcase:
> 
> diff --git a/gcc/fortran/trans-expr.cc b/gcc/fortran/trans-expr.cc
> index 06713f24f95..c7fb4633ab1 100644
> --- a/gcc/fortran/trans-expr.cc
> +++ b/gcc/fortran/trans-expr.cc
> @@ -6854,7 +6854,7 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
>  INTENT_IN, fsym->attr.pointer);
> }
>   else if (fsym && fsym->attr.contiguous
> -  && !gfc_is_simply_contiguous (e, false, true)
> +  && gfc_is_not_contiguous (e)
>&& gfc_expr_is_variable (e))
> {
>   gfc_conv_subref_array_arg (, e, nodesc_arg,
> 
> but unfortunately regresses on gfortran.dg/bind-c-contiguous-3.f90 :-(

This works:

diff --git a/gcc/fortran/trans-expr.cc b/gcc/fortran/trans-expr.cc
index b2463a28748..7bc6d72decc 100644
--- a/gcc/fortran/trans-expr.cc
+++ b/gcc/fortran/trans-expr.cc
@@ -7124,7 +7125,9 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
 INTENT_IN, fsym->attr.pointer);
}
  else if (fsym && fsym->attr.contiguous
-  && !gfc_is_simply_contiguous (e, false, true)
+  && (fsym->attr.target
+  ? gfc_is_not_contiguous (e)
+  : !gfc_is_simply_contiguous (e, false, true))
   && gfc_expr_is_variable (e))
{
  gfc_conv_subref_array_arg (, e, nodesc_arg,

[Bug target/112896] RISC-V: gcc.dg/pr30957-1.c run failure when rv64gcv_zvl1024b_zvfh_zfh

2023-12-15 Thread ewlu at rivosinc dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112896

Edwin Lu  changed:

   What|Removed |Added

 CC||ewlu at rivosinc dot com

--- Comment #1 from Edwin Lu  ---
(In reply to Li Pan from comment #0)
> The gcc.dg/pr30957-1.c test case fails in the RISC-V backend when built with
> the options below.
> 
> -march=rv64gcv_zvl1024b_zvfh_zfh -mabi=lp64d  -O2 -mcmodel=medlow
> --param=riscv-autovec-preference=fixed-vlmax -funroll-loops
> -fassociative-math -fno-trapping-math -fno-signed-zeros
> -fvariable-expansion-in-unroller -fdump-rtl-expand-details -lm
> gcc/testsuite/gcc.dg/pr30957-1.c -o test.elf
> 
> The test gcc/testsuite/gcc.dg/pr30957-1.c is similar to the following.
> 
> float __attribute__((noinline))
> foo (float d, int n)
> {
>   unsigned i;
>   float accum = d;
> 
>   for (i = 0; i < n; i++)
> accum += d;
> 
>   return accum;
> }
> 
> int
> main ()
> {
>   /* When compiling standard compliant we expect foo to return -0.0.  But the
>  variable expansion during unrolling optimization (for this testcase
>  enabled by non-compliant -fassociative-math) instantiates copy(s) of
>  the accumulator which it initializes with +0.0.  Hence we expect that
>  foo returns +0.0.  */
>   if (__builtin_copysignf (1.0, foo (0.0 / -5.0, 10)) != 1.0)
> abort ();
>   exit (0);
> }
> 
> An initial investigation shows that the RISC-V backend always gets LPT_NONE
> in unroll_loops: the loop step becomes dynamic after vectorizing, so the
> simple loop flag is false, and the unroll_loops pass does nothing for a
> non-simple loop.
> 
> We may need further investigation for this case.

Our postcommit CI recently found that the testcase now also aborts on several
other configurations.
Additional targets (linux and newlib):
- rv64gcv
- rv64 Vector Crypto
- rv64 RVA23U64 Profile
Logs and the testsuite report can be found at
https://github.com/patrick-rivos/gcc-postcommit-ci/issues/290
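
For anyone following along, the signed-zero effect described in the testcase
comment can be reproduced in plain C without any unrolling.  This is only an
illustrative sketch: the function names are made up here, and the second
function imitates by hand what variable expansion does (an extra accumulator
initialized to +0.0).  It is not what GCC literally generates.

```c
#include <math.h>

/* Sum as written in the testcase: a single accumulator seeded with D.  */
float
sum_strict (float d, unsigned n)
{
  float accum = d;
  for (unsigned i = 0; i < n; i++)
    accum += d;
  return accum;
}

/* Hand-written imitation of variable expansion (an assumption for
   illustration): a second accumulator starts at +0.0 and takes every
   other addition, and the partial sums are combined at the end.  */
float
sum_expanded (float d, unsigned n)
{
  float accum0 = d;
  float accum1 = 0.0f;
  for (unsigned i = 0; i < n; i++)
    {
      if (i % 2 == 0)
        accum0 += d;
      else
        accum1 += d;
    }
  return accum0 + accum1;
}
```

With d = -0.0f, sum_strict keeps the sign bit because (-0.0) + (-0.0) is
-0.0, while sum_expanded loses it because (+0.0) + (-0.0) is +0.0 in the
default rounding mode.  That is the difference the testcase checks with
__builtin_copysignf.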

Re: [PATCH v4 1/3] RISC-V: Add support for XCVelw extension in CV32E40P

2023-12-15 Thread Jeff Law




On 12/12/23 12:32, Mary Bennett wrote:

Spec: 
github.com/openhwgroup/core-v-sw/blob/master/specifications/corev-builtin-spec.md

Contributors:
   Mary Bennett 
   Nandni Jamnadas 
   Pietra Ferreira 
   Charlie Keaney
   Jessica Mills
   Craig Blackmore 
   Simon Cook 
   Jeremy Bennett 
   Helene Chelin 

gcc/ChangeLog:
* common/config/riscv/riscv-common.cc: Add XCVelw.
* config/riscv/corev.def: Likewise.
* config/riscv/corev.md: Likewise.
* config/riscv/riscv-builtins.cc (AVAIL): Likewise.
* config/riscv/riscv-ftypes.def: Likewise.
* config/riscv/riscv.opt: Likewise.
* doc/extend.texi: Add XCVelw builtin documentation.
* doc/sourcebuild.texi: Likewise.

gcc/testsuite/ChangeLog:
* gcc.target/riscv/cv-elw-compile-1.c: Create test for cv.elw.
* testsuite/lib/target-supports.exp: Add proc for the XCVelw extension.

Kito ACK'd V3.   I'm going to go ahead and push this to the trunk on 
Mary's behalf.  It looks independent to me and there's no need for it to 
wait.


jeff


Re: [PATCH] RISC-V: Add -fno-vect-cost-model to pr112773 testcase

2023-12-15 Thread Jeff Law




On 12/14/23 14:32, Patrick O'Neill wrote:

The testcase for pr112773 started passing after r14-6472-g8501edba91e
which was before the actual fix. This patch adds -fno-vect-cost-model
which prevents the testcase from passing due to the vls change.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/partial/pr112773.c: Add
-fno-vect-cost-model.

Signed-off-by: Patrick O'Neill 

I've pushed this to the trunk.

jeff


[Bug target/105733] riscv: Poor codegen for large stack frames

2023-12-15 Thread vineetg at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105733

Vineet Gupta  changed:

   What|Removed |Added

 CC||vineetg at gcc dot gnu.org

--- Comment #4 from Vineet Gupta  ---
There have been good improvements in gcc codegen, especially with the commit below.

commit 6619b3d4c15cd754798b1048c67f3806bbcc2e6d
Author: Jivan Hakobyan 
Date:   Wed Aug 23 14:10:30 2023 -0600

Improve quality of code from LRA register elimination

This is primarily Jivan's work, I'm mostly responsible for the write-up and
coordinating with Vlad on a few questions.

On targets with limitations on immediates usable in arithmetic instructions,
LRA's register elimination phase can construct fairly poor code.

 Tip W/o commit 6619b3d4c   | With 6619b3d4c
                            |
foo:                        | foo:
    li      t0,-4096        |     li      t0,-4096
    addi    t0,t0,2032      |     addi    t0,t0,2032
    li      a5,0            |
    li      a4,0            |
    add     sp,sp,t0        |     add     sp,sp,t0
    add     a4,a4,a5        |
    add     a5,a4,sp        |     add     a5,a5,a0
    add     a5,a5,a0        |
    li      t0,4096         |     li      t0,4096
    sb      zero,0(a5)      |     sb      zero,0(a5)
    addi    t0,t0,-2032     |     addi    t0,t0,-2032
    add     sp,sp,t0        |     add     sp,sp,t0
    jr      ra              |     jr      ra

We still have the weird LUI 4096 based constant construction. I have a patch to
avoid 4096 for certain ranges, [-4096,-2049] or [2048,4094] (cribbed from
llvm).
e.g. 2064 = 2047 + 17 and we could potentially "spread" the 2 parts over 2 adds
to SP, avoiding the LUI. However if a const costs more than 1 insn, gcc wants
to force it into a register rather than split the add operation into 2 adds with
the split constants.

expand_binop
  expand_binop_directly
   avoid_expensive_constant

/* X is to be used in mode MODE as operand OPN to BINOPTAB.  If we're
   optimizing, and if the operand is a constant that costs more than
   1 instruction, force the constant into a register and return that
   register.  Return X otherwise.  UNSIGNEDP says whether X is unsigned.  */
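
As a rough illustration of the split mentioned above (2064 = 2047 + 17), the
following sketch divides a stack adjustment into two parts that each fit the
signed 12-bit immediate of addi.  The helper name, the struct, and the split
policy (take the largest-magnitude immediate first) are assumptions made for
illustration.  This is neither the patch itself nor LLVM's code.

```c
#include <stdbool.h>

/* Range of the signed 12-bit immediate accepted by RISC-V addi.  */
enum { SIMM12_MIN = -2048, SIMM12_MAX = 2047 };

/* Two immediates whose sum is the original adjustment (hypothetical).  */
struct imm_pair
{
  int first;
  int second;
};

/* Split VALUE into two addi immediates if it lies in one of the ranges
   quoted in the comment, [-4096,-2049] or [2048,4094].  Returns false
   and leaves OUT untouched for any other value.  */
bool
split_sp_adjust (long value, struct imm_pair *out)
{
  if (value >= 2048 && value <= 4094)
    {
      out->first = SIMM12_MAX;
      out->second = (int) (value - SIMM12_MAX);
      return true;
    }
  if (value >= -4096 && value <= -2049)
    {
      out->first = SIMM12_MIN;
      out->second = (int) (value - SIMM12_MIN);
      return true;
    }
  return false;
}
```

For 2064 this yields 2047 and 17.  The upper bound 4094 is simply
2047 + 2047, and the lower bound -4096 is -2048 + -2048, which is where the
quoted ranges come from.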

[Bug fortran/112873] F2023 degree trig functions

2023-12-15 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112873

--- Comment #33 from anlauf at gcc dot gnu.org ---
(In reply to Jerry DeLisle from comment #32)
> commit a1f0d227481fe143f8c15b3f268e2d5964a3c90a (HEAD -> master,
> origin/master, origin/HEAD)
> Author: Jerry DeLisle 
> Date:   Fri Dec 15 13:05:18 2023 -0800
> 
> fortran: Update degree trigs documentation.
> 
> This is only some cleanup.
> 
> gcc/fortran/ChangeLog:
> 
> PR fortran/112783
> 
> * intrinsic.texi: Fix where no COMPLEX allowed.
> * invoke.texi: Clarify -fdev-math.
> 
> I fat fingered the PR number, sigh.

You can always --amend before pushing.

Re: [PATCH] RISC-V: Don't make Ztso imply A

2023-12-15 Thread Jeff Law




On 12/12/23 20:54, Palmer Dabbelt wrote:

I can't actually find anything in the ISA manual that makes Ztso imply
A.  In theory the memory ordering is just a different thing than the set
of available instructions (ie, Ztso without A would still imply TSO for
loads and stores).  It also seems like a configuration that could be
sane to build: without A it's all but impossible to write any meaningful
multi-core code, and TSO is really cheap for a single core.

That said, I think it's kind of reasonable to provide A to users asking
for Ztso.  So maybe even if this was a mistake it's the right thing to
do?

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc (riscv_implied_info):
Remove {"ztso", "a"}.

I'd tend to think step #1 is to determine what the ISA intent is, 
meaning engagement with RVI.


We've got time for that engagement and to adjust based on the result. 
So I'd tend to defer until we know if Ztso should imply A or not.


jeff


[Bug target/110201] RISC-V: __builtin_riscv_sm4ks and __builtin_riscv_sm4ed produce invalid assembly

2023-12-15 Thread law at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110201

Jeffrey A. Law  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED

--- Comment #9 from Jeffrey A. Law  ---
Should (finally) be fixed on the trunk.

Re: [PR target/110201] Fix operand types for various scalar crypto insns

2023-12-15 Thread Jeff Law




On 12/14/23 17:14, Christoph Müllner wrote:

On Fri, Dec 15, 2023 at 12:36 AM Jeff Law  wrote:




On 12/14/23 02:46, Christoph Müllner wrote:

On Tue, Jun 20, 2023 at 12:34 AM Jeff Law via Gcc-patches
 wrote:



A handful of the scalar crypto instructions are supposed to take a
constant integer argument 0..3 inclusive.  A suitable constraint was
created and used for this purpose (D03), but the operand's predicate is
"register_operand".  That's just wrong.

This patch adds a new predicate "const_0_3_operand" and fixes the
relevant insns to use it.  One could argue the constraint is redundant
now (and you'd be correct).  I wouldn't lose sleep if someone wanted
that removed, in which case I'll spin up a V2.

The testsuite was broken in a way that made it consistent with the
compiler, so the tests passed, when they really should have been issuing
errors all along.

This patch adjusts the existing tests so that they all expect a
diagnostic on the invalid operand usage (including out of range
constants).  It adds new tests with proper constants, testing the
extremes of valid values.

OK for the trunk, or should we remove the D03 constraint?


Reviewed-by: Christoph Muellner 

The patch does not apply cleanly anymore, because there were some
small changes in crypto.md.

Here's an update to that old patch that also takes care of the pattern
where we allow 0..10 inclusive, but not registers.

Regression tested on rv64gc without new failures.  It'll need a
ChangeLog when approved, but that's easy to adjust.


Looks good and tests pass for rv64gc and rv32gc.

Reviewed-by: Christoph Muellner 
Tested-by: Christoph Muellner 

I've pushed this to the trunk with Liao listed as a co-author.

jeff


[Bug preprocessor/105608] [11/12/13/14 Regression] ICE: in linemap_add with a really long defined macro on the command line r11-338-g2a0225e47868fbfc

2023-12-15 Thread lhyatt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105608

Lewis Hyatt  changed:

   What|Removed |Added

   Keywords||patch
URL||https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639467.html

--- Comment #4 from Lewis Hyatt  ---
I submitted the simple patch to resolve the ICE here:
https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639467.html

[Bug target/110201] RISC-V: __builtin_riscv_sm4ks and __builtin_riscv_sm4ed produce invalid assembly

2023-12-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110201

--- Comment #8 from GCC Commits  ---
The master branch has been updated by Jeff Law :

https://gcc.gnu.org/g:082835836cf763e74ed2fdf9645ac4f1f94d6d4f

commit r14-6607-g082835836cf763e74ed2fdf9645ac4f1f94d6d4f
Author: Jeff Law 
Date:   Fri Dec 15 14:19:25 2023 -0700

Re: [PATCH] RISC-V: fix scalar crypto patterns

A handful of the scalar crypto instructions are supposed to take a
constant integer argument 0..3 inclusive and one should accept 0..10.
A suitable constraint was created and used for this purpose (D03 and DsA),
but the operand's predicate is "register_operand".  That's just wrong.

This patch adds new predicates "const_0_3_operand" and
"const_0_10_operand" and fixes the relevant insns to use the appropriate
predicate.  It drops the now unnecessary constraints.

The testsuite was broken in a way that made it consistent with the
compiler, so the tests passed, when they really should have been issuing
errors all along.

This patch adjusts the existing tests so that they all expect a
diagnostic on the invalid operand usage (including out of range
constants).  It adds new tests with proper constants, testing the
extremes of valid values.

PR target/110201

gcc/

* config/riscv/constraints.md (D03, DsA): Remove unused
constraints.
* config/riscv/predicates.md (const_0_3_operand): New predicate.
(const_0_10_operand): Likewise.
* config/riscv/crypto.md (riscv_aes32dsi): Use new predicate.  Drop
unnecessary constraint.
(riscv_aes32dsmi, riscv_aes64im, riscv_aes32esi): Likewise.
(riscv_aes32esmi, *riscv__si): Likewise.
(riscv__di_extend, riscv__si): Likewise.

gcc/testsuite
* gcc.target/riscv/zknd32.c: Verify diagnostics are issued for
invalid builtin arguments.
* gcc.target/riscv/zknd64.c: Likewise.
* gcc.target/riscv/zkne32.c: Likewise.
* gcc.target/riscv/zkne64.c: Likewise.
* gcc.target/riscv/zksed32.c: Likewise.
* gcc.target/riscv/zksed64.c: Likewise.
* gcc.target/riscv/zknd32-2.c: New test
* gcc.target/riscv/zknd64-2.c: Likewise.
* gcc.target/riscv/zkne32-2.c: Likewise.
* gcc.target/riscv/zkne64-2.c: Likewise.
* gcc.target/riscv/zksed32-2.c: Likewise.
* gcc.target/riscv/zksed64-2.c: Likewise.

Co-authored-by: Liao Shihua 
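
To make the predicate change concrete, here is a toy model in C of the
difference between the old and new operand checks.  The operand struct and
the helper names old_register_check and const_in_range are hypothetical
stand-ins, not GCC's RTL types or predicate machinery.  Only the ranges 0..3
and 0..10 and the two predicate names come from the commit message.

```c
#include <stdbool.h>

/* Minimal stand-in for an insn operand (hypothetical model).  */
enum operand_kind { OPERAND_REG, OPERAND_CONST_INT };

struct operand
{
  enum operand_kind kind;
  long value;   /* register number or constant value */
};

/* Old behaviour: the predicate only asked "is this a register?",
   so a constant selector was never matched directly.  */
bool
old_register_check (const struct operand *op)
{
  return op->kind == OPERAND_REG;
}

/* Shared helper: accept only integer constants within [LO, HI].  */
static bool
const_in_range (const struct operand *op, long lo, long hi)
{
  return op->kind == OPERAND_CONST_INT && op->value >= lo && op->value <= hi;
}

/* New behaviour, modelled on the two predicates named in the commit.  */
bool
const_0_3_operand (const struct operand *op)
{
  return const_in_range (op, 0, 3);
}

bool
const_0_10_operand (const struct operand *op)
{
  return const_in_range (op, 0, 10);
}
```

In this model a register operand passes the old check but fails both new
ones, and a constant such as 4 is rejected by const_0_3_operand.  That
mirrors why the adjusted tests now expect a diagnostic for registers and
out-of-range constants.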

[Bug ipa/112783] core dump on libxo when function is inlined

2023-12-15 Thread jvdelisle at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112783

Jerry DeLisle  changed:

   What|Removed |Added

 CC||jvdelisle at gcc dot gnu.org

--- Comment #8 from Jerry DeLisle  ---
(In reply to GCC Commits from comment #7)
> The master branch has been updated by Jerry DeLisle :
> 
> https://gcc.gnu.org/g:a1f0d227481fe143f8c15b3f268e2d5964a3c90a
> 
> commit r14-6606-ga1f0d227481fe143f8c15b3f268e2d5964a3c90a
> Author: Jerry DeLisle 
> Date:   Fri Dec 15 13:05:18 2023 -0800
> 
> fortran: Update degree trigs documentation.
> 
> This is only some cleanup.
> 
> gcc/fortran/ChangeLog:
> 
> PR fortran/112783
> 
> * intrinsic.texi: Fix where no COMPLEX allowed.
> * invoke.texi: Clarify -fdev-math.

Typo in the PR number. Please ignore; it should be 112873.

[Bug fortran/112873] F2023 degree trig functions

2023-12-15 Thread jvdelisle at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112873

--- Comment #32 from Jerry DeLisle  ---
commit a1f0d227481fe143f8c15b3f268e2d5964a3c90a (HEAD -> master, origin/master,
origin/HEAD)
Author: Jerry DeLisle 
Date:   Fri Dec 15 13:05:18 2023 -0800

fortran: Update degree trigs documentation.

This is only some cleanup.

gcc/fortran/ChangeLog:

PR fortran/112783

* intrinsic.texi: Fix where no COMPLEX allowed.
* invoke.texi: Clarify -fdev-math.

I fat fingered the PR number, sigh.

[Bug ipa/112783] core dump on libxo when function is inlined

2023-12-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112783

--- Comment #7 from GCC Commits  ---
The master branch has been updated by Jerry DeLisle :

https://gcc.gnu.org/g:a1f0d227481fe143f8c15b3f268e2d5964a3c90a

commit r14-6606-ga1f0d227481fe143f8c15b3f268e2d5964a3c90a
Author: Jerry DeLisle 
Date:   Fri Dec 15 13:05:18 2023 -0800

fortran: Update degree trigs documentation.

This is only some cleanup.

gcc/fortran/ChangeLog:

PR fortran/112783

* intrinsic.texi: Fix where no COMPLEX allowed.
* invoke.texi: Clarify -fdev-math.

[Bug analyzer/112792] -Wanalyzer-out-of-bounds false positives seen on Linux kernel with certain unions

2023-12-15 Thread dmalcolm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112792

David Malcolm  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
 Ever confirmed|0   |1
   Last reconfirmed||2023-12-15

--- Comment #1 from David Malcolm  ---
Am testing a fix.

[Bug target/112944] AVR: Support .rodata in Flash for Devices with FLMAP

2023-12-15 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112944

Georg-Johann Lay  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2023-12-15
 Status|UNCONFIRMED |ASSIGNED

[Bug target/112918] [m68k] [LRA] ICE: maximum number of generated reload insns per insn achieved (90)

2023-12-15 Thread vmakarov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112918

--- Comment #12 from Vladimir Makarov  ---
I've been working on the PR this week.  The problem in this case is that,
for a subreg reload, LRA cannot narrow the reg class further from ALL_REGS
to GENERAL_REGS and then to data regs or address regs.

The patch will be ready today but I am going to test it well and submit it on
Monday as it changes a sensitive part of LRA and might be risky.

Re: [PATCH v4 10/11] aarch64: Add new load/store pair fusion pass

2023-12-15 Thread Alex Coplan
On 15/12/2023 15:34, Richard Sandiford wrote:
> Alex Coplan  writes:
> > This is a v6 of the aarch64 load/store pair fusion pass, which
> > addresses the feedback from Richard's last review here:
> >
> > https://gcc.gnu.org/pipermail/gcc-patches/2023-December/640539.html
> >
> > In particular this version implements the suggested changes which
> > greatly simplify the double list walk.
> >
> > Bootstrapped/regtested as a series on aarch64-linux-gnu, OK for trunk?
> >
> > Thanks,
> > Alex
> >
> > -- >8 --
> >
> > This adds a new aarch64-specific RTL-SSA pass dedicated to forming load
> > and store pairs (LDPs and STPs).
> >
> > As a motivating example for the kind of thing this improves, take the
> > following testcase:
> >
> > extern double c[20];
> >
> > double f(double x)
> > {
> >   double y = x*x;
> >   y += c[16];
> >   y += c[17];
> >   y += c[18];
> >   y += c[19];
> >   return y;
> > }
> >
> > for which we currently generate (at -O2):
> >
> > f:
> >         adrp    x0, c
> >         add     x0, x0, :lo12:c
> >         ldp     d31, d29, [x0, 128]
> >         ldr     d30, [x0, 144]
> >         fmadd   d0, d0, d0, d31
> >         ldr     d31, [x0, 152]
> >         fadd    d0, d0, d29
> >         fadd    d0, d0, d30
> >         fadd    d0, d0, d31
> >         ret
> >
> > but with the pass, we generate:
> >
> > f:
> > .LFB0:
> >         adrp    x0, c
> >         add     x0, x0, :lo12:c
> >         ldp     d31, d29, [x0, 128]
> >         fmadd   d0, d0, d0, d31
> >         ldp     d30, d31, [x0, 144]
> >         fadd    d0, d0, d29
> >         fadd    d0, d0, d30
> >         fadd    d0, d0, d31
> >         ret
> >
> > The pass is local (only considers a BB at a time).  In theory, it should
> > be possible to extend it to run over EBBs, at least in the case of pure
> > (MEM_READONLY_P) loads, but this is left for future work.
> >
> > The pass works by identifying two kinds of bases: tree decls obtained
> > via MEM_EXPR, and RTL register bases in the form of RTL-SSA def_infos.
> > If a candidate memory access has a MEM_EXPR base, then we track it via
> > this base, and otherwise if it is of a simple reg +  form, we track
> > it via the RTL-SSA def_info for the register.
> >
> > For each BB, for a given kind of base, we build up a hash table mapping
> > the base to an access_group.  The access_group data structure holds a
> > list of accesses at each offset relative to the same base.  It uses a
> > splay tree to support efficient insertion (while walking the bb), and
> > the nodes are chained using a linked list to support efficient
> > iteration (while doing the transformation).
> >
> > For each base, we then iterate over the access_group to identify
> > adjacent accesses, and try to form load/store pairs for those insns that
> > access adjacent memory.
> >
> > The pass is currently run twice, both before and after register
> > allocation.  The first copy of the pass is run late in the pre-RA RTL
> > pipeline, immediately after sched1, since it was found that sched1 was
> > increasing register pressure when the pass was run before.  The second
> > copy of the pass runs immediately before peephole2, so as to get any
> > opportunities that the existing ldp/stp peepholes can handle.
> >
> > There are some cases that we punt on before RA, e.g.
> > accesses relative to eliminable regs (such as the soft frame pointer).
> > We do this since we can't know the elimination offset before RA, and we
> > want to avoid the RA reloading the offset (due to being out of ldp/stp
> > immediate range) as this can generate worse code.
> >
> > The post-RA copy of the pass is there to pick up the crumbs that were
> > left behind / things we punted on in the pre-RA pass.  Among other
> > things, it's needed to handle accesses relative to the stack pointer.
> > It can also handle code that didn't exist at the time the pre-RA pass
> > was run (spill code, prologue/epilogue code).
> >
> > This is an initial implementation, and there are (among other possible
> > improvements) the following notable caveats / missing features that are
> > left for future work, but could give further improvements:
> >
> >  - Moving accesses between BBs within an EBB, see above.
> >  - Out-of-range opportunities: currently the pass refuses to form pairs
> >if there isn't a suitable base register with an immediate in range
> >for ldp/stp, but it can be profitable to emit anchor addresses in the
> >case that there are four or more out-of-range nearby accesses that can
> >be formed into pairs.  This is handled by the current ldp/stp
> >peepholes, so it would be good to support this in the future.
> >  - Discovery: currently we prioritize MEM_EXPR bases over RTL bases, which
> >    can lead to us missing opportunities in the case that two accesses have
> >    distinct MEM_EXPR bases (i.e. different DECLs) but they are still
> >    adjacent in memory (e.g. adjacent variables on the stack).  I hope to
> >    address this for
[Bug target/109796] 548.exchange2_r compiled with -O2 -flto regressed by 5% on aarch64 between r14-135-gd06e9264b0192c and r14-192-g636e2273aec555

2023-12-15 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109796

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #1 from Andrew Pinski  ---
Looks like this has been resolved.

[Bug middle-end/26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)

2023-12-15 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
Bug 26163 depends on bug 109796, which changed state.

Bug 109796 Summary: 548.exchange2_r compiled with -O2 -flto regressed by 5% on 
aarch64 between r14-135-gd06e9264b0192c and r14-192-g636e2273aec555
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109796

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

Request for Direction.

2023-12-15 Thread David H. Lynch Jr. via Gcc
I am part of a project developing content addressable memory.  I am the
2nd author for a paper on this presented at MEMSYS 2023, and with
additions likely to be accepted by ACM shortly. 
https://www.memsys.io/wp-content/uploads/2023/09/10.pdf

My role is to develop software to demonstrate the benefits of the
hardware/memory. A part of that is implementing language extensions to
provide native support for content addressable memory, and then
modifying some applications to utilize those extensions and
demonstrate the value.

We have already developed a C/C++ preprocessor, that is mostly
functional,  but are looking to move to altering some actual compilers.

At this time this work is purely proof of the value proposition of
content addressable memory. Presuming that our work proves valuable,
that will provide an impetus for further work.

Right now I am just focused on some means to deliver support. 

So I am looking for direction regarding how to easily extend gcc to
provide support for content addressable memory.

Basically I need to be able to tag variables as Content addressable,
rather than normally addressed, and then change code generation for CA
variables such that they reference memory by key rather than address. 

Is there a guide anywhere to developing language extensions for GCC
and/or making changes to code generation?

I am a competent embedded software developer, with some ancient
experience with compilers, but starting from scratch with GCC.
Pointers would be appreciated.  Help would be appreciated. While I am
leading this part of the project, there is some funding available for
assistance.

Some recent languages have some form of content based addressing, but
this is implemented by the CPU.  We have altered the address logic of
memory to change the way an "address" is handled such that it can
function as a key rather than a traditional linear address.

We have demonstrated Sort in Memory with relatively simple changes to
memory addressing logic, and we have extended the addressing
capabilities to things like sparse array notation which has
applications to AI. 

We are not looking to feed anything into the GCC distribution. 
But the software  will be open source. 


[Bug c++/113021] [constexpr] gcc rejects initializing struct containing vector during constant evaluation depending if the struct also contains other member

2023-12-15 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113021

--- Comment #4 from Andrew Pinski  ---
Looks like it is the uninitialized field pointer which is causing the issues.
Looks like GCC does not currently fully support that for const variables inside
a constexpr/consteval. The linked PRs are about strings, but I think the same
issue shows up.

[Bug target/113009] [14] RISC-V: gcc.c-torture/unsorted/dump-noaddr.c flakey tests

2023-12-15 Thread ewlu at rivosinc dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113009

Edwin Lu  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #6 from Edwin Lu  ---
(In reply to JuzheZhong from comment #5)
> I don't see this dump FAIL in the trunk now.
> 
> Could you confirm it ?

Ran 1000 times on r14-6596-g7d7a480eedf and don't see it anymore. Thanks!

Re: [PATCH] RISC-V: Add Zvfbfmin extension to the -march= option

2023-12-15 Thread Jeff Law




On 12/12/23 20:24, Xiao Zeng wrote:
> This patch would like to add new sub extension (aka Zvfbfmin) to the
> -march= option. It introduces a new data type BF16.
>
> Depending on different usage scenarios, the Zvfbfmin extension may
> depend on 'V' or 'Zve32f'. This patch only implements dependencies
> in scenario of Embedded Processor. In scenario of Application
> Processor, it is necessary to explicitly indicate the dependent
> 'V' extension.
>
> You can locate more information about Zvfbfmin from below spec doc.
>
> https://github.com/riscv/riscv-bfloat16/releases/download/20231027/riscv-bfloat16.pdf
>
> gcc/ChangeLog:
>
> * common/config/riscv/riscv-common.cc:
> (riscv_implied_info): Add zvfbfmin item.
>  (riscv_ext_version_table): Ditto.
>  (riscv_ext_flag_table): Ditto.
> * config/riscv/riscv.opt:
> (MASK_ZVFBFMIN): New macro.
> (MASK_VECTOR_ELEN_BF_16): Ditto.
> (TARGET_ZVFBFMIN): Ditto.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/arch-31.c: New test.
> * gcc.target/riscv/arch-32.c: New test.
> * gcc.target/riscv/predef-32.c: New test.
> * gcc.target/riscv/predef-33.c: New test.

I fixed the trivial whitespace issue with the ChangeLog and pushed this
to the trunk.  However, I do want to stress that all future
contributions need to indicate that the patch was successfully
regression tested.


jeff


Re: [PATCH v7] libgfortran: Replace mutex with rwlock

2023-12-15 Thread Richard Earnshaw




On 15/12/2023 11:31, Lipeng Zhu wrote:



On 2023/12/14 23:50, Richard Earnshaw (lists) wrote:

On 09/12/2023 15:39, Lipeng Zhu wrote:

This patch introduces an rwlock and splits reads and writes of the
unit_root tree and unit_cache using the rwlock instead of the mutex, to
increase CPU efficiency. In the get_gfc_unit function, the insert_unit
function is reached only around 30% of the time; in most instances
the unit is found while reading the unit_cache or the unit_root tree.
Splitting the read and write phases with an rwlock is therefore one
way to make this code more parallel.

BTW, the IPC metric improves by around 9x on our test
server with 220 cores. The benchmark we used is
https://github.com/rwesson/NEAT

libgcc/ChangeLog:

* gthr-posix.h (__GTHREAD_RWLOCK_INIT): New macro.
(__gthrw): New function.
(__gthread_rwlock_rdlock): New function.
(__gthread_rwlock_tryrdlock): New function.
(__gthread_rwlock_wrlock): New function.
(__gthread_rwlock_trywrlock): New function.
(__gthread_rwlock_unlock): New function.

libgfortran/ChangeLog:

* io/async.c (DEBUG_LINE): New macro.
* io/async.h (RWLOCK_DEBUG_ADD): New macro.
(CHECK_RDLOCK): New macro.
(CHECK_WRLOCK): New macro.
(TAIL_RWLOCK_DEBUG_QUEUE): New macro.
(IN_RWLOCK_DEBUG_QUEUE): New macro.
(RDLOCK): New macro.
(WRLOCK): New macro.
(RWUNLOCK): New macro.
(RD_TO_WRLOCK): New macro.
(INTERN_RDLOCK): New macro.
(INTERN_WRLOCK): New macro.
(INTERN_RWUNLOCK): New macro.
* io/io.h (struct gfc_unit): Change UNIT_LOCK to UNIT_RWLOCK in
a comment.
(unit_lock): Remove including associated internal_proto.
(unit_rwlock): New declarations including associated internal_proto.
(dec_waiting_unlocked): Use WRLOCK and RWUNLOCK on unit_rwlock
instead of __gthread_mutex_lock and __gthread_mutex_unlock on
unit_lock.
* io/transfer.c (st_read_done_worker): Use WRLOCK and RWUNLOCK on
unit_rwlock instead of LOCK and UNLOCK on unit_lock.
(st_write_done_worker): Likewise.
* io/unit.c: Change UNIT_LOCK to UNIT_RWLOCK in 'IO locking rules'
comment. Use unit_rwlock variable instead of unit_lock variable.
(get_gfc_unit_from_unit_root): New function.
(get_gfc_unit): Use RDLOCK, WRLOCK and RWUNLOCK on unit_rwlock
instead of LOCK and UNLOCK on unit_lock.
(close_unit_1): Use WRLOCK and RWUNLOCK on unit_rwlock instead of
LOCK and UNLOCK on unit_lock.
(close_units): Likewise.
(newunit_alloc): Use RWUNLOCK on unit_rwlock instead of UNLOCK on
unit_lock.
* io/unix.c (find_file): Use RDLOCK and RWUNLOCK on unit_rwlock
instead of LOCK and UNLOCK on unit_lock.
(flush_all_units): Use WRLOCK and RWUNLOCK on unit_rwlock instead
of LOCK and UNLOCK on unit_lock.



It looks like this has broken builds on arm-none-eabi when using newlib:

In file included from /work/rearnsha/gnusrc/nightly/gcc-cross/master/libgfortran/runtime/error.c:27:
/work/rearnsha/gnusrc/nightly/gcc-cross/master/libgfortran/io/io.h: In function ‘dec_waiting_unlocked’:
/work/rearnsha/gnusrc/nightly/gcc-cross/master/libgfortran/io/io.h:1023:3: error: implicit declaration of function ‘WRLOCK’ [-Wimplicit-function-declaration]
 1023 |   WRLOCK (&unit_rwlock);
      |   ^~~~~~
/work/rearnsha/gnusrc/nightly/gcc-cross/master/libgfortran/io/io.h:1025:3: error: implicit declaration of function ‘RWUNLOCK’ [-Wimplicit-function-declaration]
 1025 |   RWUNLOCK (&unit_rwlock);
      |   ^~~~~~~~


R.


Hi Richard,

The root cause is that the macros WRLOCK and RWUNLOCK are not defined in
io.h. The x86 build did not fail because HAVE_ATOMIC_FETCH_ADD is
defined there, so the code using these macros was never compiled.
The logic is shown below:

#ifdef HAVE_ATOMIC_FETCH_ADD
   (void) __atomic_fetch_add (&u->waiting, -1, __ATOMIC_RELAXED);
#else
   WRLOCK (&unit_rwlock);
   u->waiting--;
   RWUNLOCK (&unit_rwlock);
#endif

I have drafted a patch to fix this bug. Since I don't have an ARM
platform, could you help validate that it fixes the problem on ARM?


diff --git a/libgfortran/io/io.h b/libgfortran/io/io.h
index 15daa0995b1..c7f0f7d7d9e 100644
--- a/libgfortran/io/io.h
+++ b/libgfortran/io/io.h
@@ -1020,9 +1020,15 @@ dec_waiting_unlocked (gfc_unit *u)
  #ifdef HAVE_ATOMIC_FETCH_ADD
    (void) __atomic_fetch_add (&u->waiting, -1, __ATOMIC_RELAXED);
  #else
-  WRLOCK (&unit_rwlock);
+#ifdef __GTHREAD_RWLOCK_INIT
+  __gthread_rwlock_wrlock (&unit_rwlock);
+  u->waiting--;
+  __gthread_rwlock_unlock (&unit_rwlock);
+#else
+  __gthread_mutex_lock (&unit_rwlock);
    u->waiting--;
-  RWUNLOCK (&unit_rwlock);
+  __gthread_mutex_unlock (&unit_rwlock);
+#endif
  #endif
  }
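
The reason the breakage stayed hidden on x86 can be reproduced in isolation. In the sketch below, which is only a model and not the libgfortran source, the locking branch is compiled only when the atomic builtin is unavailable, so any mistake in that branch is invisible on targets that take the atomic path. All demo_ names and the DEMO_HAVE_ATOMIC_FETCH_ADD macro are hypothetical; the counting lock is a stand-in so that the chosen path can be observed.

```c
#include <assert.h>

/* Hypothetical stand-in for gfc_unit; only the field used here. */
typedef struct {
  int waiting;
} demo_unit_t;

/* Hypothetical lock that records how often it was taken. */
typedef struct {
  int held;
  int acquisitions;
} demo_lock_t;

static demo_lock_t demo_unit_lock;

static void
demo_lock (demo_lock_t *l)
{
  assert (!l->held);
  l->held = 1;
  l->acquisitions++;
}

static void
demo_unlock (demo_lock_t *l)
{
  assert (l->held);
  l->held = 0;
}

/* Decrement the waiting count.  Build with -DDEMO_HAVE_ATOMIC_FETCH_ADD
   to mimic a target where the atomic builtin is available; without it
   the lock-based branch is the one that gets compiled and run.  */
static void
demo_dec_waiting (demo_unit_t *u)
{
#ifdef DEMO_HAVE_ATOMIC_FETCH_ADD
  (void) __atomic_fetch_add (&u->waiting, -1, __ATOMIC_RELAXED);
#else
  demo_lock (&demo_unit_lock);
  u->waiting--;
  demo_unlock (&demo_unit_lock);
#endif
}
```

Because the preprocessor discards the unused branch before the compiler proper sees it, an undeclared macro or function there produces no diagnostic on the other configuration, which is why building both configurations matters for changes like this one.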


Lipeng Zhu


Hi Lipeng,

Thanks for the quick reply.  I can confirm that with the above change 
the bootstrap failure is fixed.  However, this shouldn't be considered a 
formal review; libgfortran is not really my area.


I'll be away now until January 2nd.

Richard.


Re: [PATCH 2/2] c++: partial ordering and dep alias tmpl specs [PR90679]

2023-12-15 Thread Patrick Palka
On Thu, 1 Jun 2023, Patrick Palka wrote:

> During partial ordering, we want to look through dependent alias
> template specializations within template arguments and otherwise
> treat them as opaque in other contexts (see e.g. r7-7116-g0c942f3edab108
> and r11-7011-g6e0a231a4aa240).  To that end template_args_equal was
> given a partial_order flag that controls this behavior.  This flag
> does the right thing when a dependent alias template specialization
> appears as template argument of the partial specialization, e.g. in
> 
>   template using first_t = T;
>   template struct traits;
>   template struct traits> { }; // #1
>   template struct traits> { }; // #2
> 
> we correctly consider #2 to be more specialized than #1.  But if
> the alias specialization appears as a template argument of another
> class template specialization, e.g. in
> 
>   template struct traits>> { }; // #1
>   template struct traits>> { }; // #2
> 
> then we incorrectly consider #1 and #2 to be unordered.  This is because
> 
>   1. we don't propagate the flag to recursive template_args_equal calls
>   2. we don't use structural equality for class template specializations
>  written in terms of dependent alias template specializations
> 
> This patch fixes the first issue by turning the partial_order flag into
> a global.  This patch fixes the second issue by making us propagate
> structural equality appropriately when building a class template
> specialization.  In passing this patch also improves hashing of
> specializations that use structural equality.
> 
> Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> trunk?
> 
>   PR c++/90679
> 
> gcc/cp/ChangeLog:
> 
>   * cp-tree.h (comp_template_args): Remove partial_order
>   parameter.
>   (template_args_equal): Likewise.
>   * pt.cc (iterative_hash_template_arg) : Hash
>   the template and arguments for specializations that use
>   structural equality.
>   (comparing_for_partial_ordering): New flag.
>   (template_args_equal): Remove partial order parameter and
>   use comparing_for_partial_ordering instead.
>   (comp_template_args): Likewise.
>   (comp_template_args_porder): Set comparing_for_partial_ordering
>   instead.  Make static.
>   (any_template_arguments_need_structural_equality_p): Return true
>   for an argument that's a dependent alias template specialization
>   or a class template specialization that itself needs structural
>   equality.
>   * tree.cc (cp_tree_equal) : Adjust call to
>   comp_template_args.
> 
> gcc/testsuite/ChangeLog:
> 
>   * g++.dg/cpp0x/alias-decl-75a.C: New test.
>   * g++.dg/cpp0x/alias-decl-75b.C: New test.

Ping.  Here's the rebased patch:

-- >8 --

Subject: [PATCH 2/2] c++: partial ordering and dep alias tmpl specs [PR90679]

During partial ordering, we want to look through dependent alias
template specializations within template arguments and otherwise
treat them as opaque in other contexts (see e.g. r7-7116-g0c942f3edab108
and r11-7011-g6e0a231a4aa240).  To that end template_args_equal was
given a partial_order flag that controls this behavior.  This flag
does the right thing when a dependent alias template specialization
appears as template argument of the partial specialization, e.g. in

  template using first_t = T;
  template struct traits;
  template struct traits> { }; // #1
  template struct traits> { }; // #2

we correctly consider #2 to be more specialized than #1.  But if
the alias specialization appears as a template argument of another
class template specialization, e.g. in

  template struct traits>> { }; // #1
  template struct traits>> { }; // #2

then we incorrectly consider #1 and #2 to be unordered.  This is because

  1. we don't propagate the flag to recursive template_args_equal calls
  2. we don't use structural equality for class template specializations
 written in terms of dependent alias template specializations

This patch fixes the first issue by turning the partial_order flag into
a global.  This patch fixes the second issue by making us propagate
structural equality appropriately when building a class template
specialization.  In passing this patch also improves hashing of
specializations that use structural equality.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

PR c++/90679

gcc/cp/ChangeLog:

* cp-tree.h (comp_template_args): Remove partial_order
parameter.
(template_args_equal): Likewise.
* pt.cc (iterative_hash_template_arg) : Hash
the template and arguments for specializations that use
structural equality.
(comparing_for_partial_ordering): New flag.
(template_args_equal): Remove partial order parameter and
use comparing_for_partial_ordering instead.
(comp_template_args): Likewise.
(comp_template_args_porder): Set comparing_for_partial_ordering

Re: [PATCH 1/2] c++: refine dependent_alias_template_spec_p [PR90679]

2023-12-15 Thread Patrick Palka
On Mon, 11 Sep 2023, Patrick Palka wrote:

> On Thu, 1 Jun 2023, Patrick Palka wrote:
> 
> > For a complex alias template-id, dependent_alias_template_spec_p returns
> > true if any template argument of the template-id is dependent.  This
> > predicate indicates that substitution into the template-id may behave
> > differently with respect to SFINAE than substitution into the expanded
> > alias, and so the alias is in a way non-transparent.  For example
> > 'first_t' in
> > 
> >   template using first_t = T;
> >   template first_t f();
> > 
> > is such an alias template-id since first_t doesn't use its second
> > template parameter and so the substitution into the expanded alias would
> > discard the SFINAE effects of the corresponding (dependent) argument 'T&'.
> > 
> > But this predicate is overly conservative since what really matters for
> > sake of SFINAE equivalence is whether a template argument corresponding
> > to an _unused_ template parameter is dependent.  So the predicate should
> > return false for e.g. 'first_t' or 'first_t'.
> > 
> > This patch refines the predicate appropriately.  We need to be able to
> > efficiently determine which template parameters of a complex alias
> > template are unused, so to that end we add a new out parameter to
> > complex_alias_template_p and cache its result in an on-the-side
> > hash_map that replaces the existing TEMPLATE_DECL_COMPLEX_ALIAS_P
> > flag.  And in doing so, we fix a latent bug that this flag wasn't
> > being propagated during partial instantiation, and so we were treating
> > all partially instantiated member alias templates as non-complex.
> > 
> > PR c++/90679
> > 
> > gcc/cp/ChangeLog:
> > 
> > * cp-tree.h (TEMPLATE_DECL_COMPLEX_ALIAS_P): Remove.
> > (most_general_template): Constify parameter.
> > * pt.cc (push_template_decl): Adjust after removing
> > TEMPLATE_DECL_COMPLEX_ALIAS_P.
> > (complex_alias_tmpl_info): New hash_map.
> > (uses_all_template_parms_data::seen): Change type to
> > tree* from bool*.
> > (complex_alias_template_r): Adjust accordingly.
> > (complex_alias_template_p): Add 'seen_out' out parameter.
> > Call most_general_template and check PRIMARY_TEMPLATE_P.
> > Use complex_alias_tmpl_info to cache the result and set
> > '*seen_out' accordingly.
> > (dependent_alias_template_spec_p): Add !processing_template_decl
> > early exit test.  Consider dependence of only template arguments
> > corresponding to seen template parameters as per
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > * g++.dg/cpp0x/alias-decl-75.C: New test.
> 
> Ping.

Ping.  Here's a rebased patch:

-- >8 --

Subject: [PATCH 1/2] c++: refine dependent_alias_template_spec_p [PR90679]

For a (complex) alias template-id, dependent_alias_template_spec_p
returns true if any template argument of the template-id is dependent.
This predicate indicates that substitution into the template-id may
behave differently with respect to SFINAE than substitution into the
expanded alias, and so the alias is in a way non-transparent.

For example, 'first_t' in

  template using first_t = T;
  template first_t f();

is such an alias template-id since first_t doesn't use its second
template parameter and so the substitution into the expanded alias would
discard the SFINAE effects of the corresponding (dependent) argument 'T&'.

But this predicate is overly conservative since what really matters for
sake of SFINAE equivalence is whether a template argument corresponding
to an _unused_ template parameter is dependent.  So the predicate should
return false for e.g. 'first_t'.

This patch refines the predicate appropriately.  We need to be able to
efficiently determine which template parameters of a complex alias
template are unused, so to that end we add a new out parameter to
complex_alias_template_p and cache its result in an on-the-side
hash_map that replaces the existing TEMPLATE_DECL_COMPLEX_ALIAS_P
flag.

PR c++/90679

gcc/cp/ChangeLog:

* cp-tree.h (TEMPLATE_DECL_COMPLEX_ALIAS_P): Remove.
(most_general_template): Constify parameter.
* pt.cc (push_template_decl): Adjust after removing
TEMPLATE_DECL_COMPLEX_ALIAS_P.
(complex_alias_tmpl_info): New hash_map.
(uses_all_template_parms_data::seen): Change type to
tree* from bool*.
(complex_alias_template_r): Adjust accordingly.
(complex_alias_template_p): Add 'seen_out' out parameter.
Call most_general_template and check PRIMARY_TEMPLATE_P.
Use complex_alias_tmpl_info to cache the result and set
'*seen_out' accordingly.
(dependent_alias_template_spec_p): Add !processing_template_decl
early exit test.  Consider dependence of only template arguments
corresponding to seen template parameters as per

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/alias-decl-76.C: New test.
---
 gcc/cp/cp-tree.h   |   7 +-
 gcc/cp/pt.cc  

[PATCH 3/3][RFC] RISC-V: Enable assert for insn_has_dfa_reservation

2023-12-15 Thread Edwin Lu
Enables assert that every typed instruction is associated with a
dfa reservation

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_sched_variable_issue): enable assert

Signed-off-by: Edwin Lu 
---
 gcc/config/riscv/riscv.cc | 2 --
 1 file changed, 2 deletions(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index ab0f95e5fe9..3adeb415bec 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -8048,9 +8048,7 @@ riscv_sched_variable_issue (FILE *, int, rtx_insn *insn, int more)
 
   /* If we ever encounter an insn without an insn reservation, trip
  an assert so we can find and fix this problem.  */
-#if 0
   gcc_assert (insn_has_dfa_reservation_p (insn));
-#endif
 
   return more - 1;
 }
-- 
2.34.1



[PATCH 2/3][RFC] RISC-V: Add vector related reservations

2023-12-15 Thread Edwin Lu
This patch copies the vector reservations from generic-ooo.md and
inserts them into generic.md and sifive.md. The vector pipelines are
necessary to avoid an ICE from the assert

gcc/ChangeLog:

* config/riscv/generic-ooo.md: syntax
* config/riscv/generic.md (pipe0): new reservation
(generic_vec_load): ditto
(generic_vec_store): ditto
(generic_vec_loadstore_seg): ditto
(generic_generic_vec_alu): ditto
(generic_vec_fcmp): ditto
(generic_vec_imul): ditto
(generic_vec_fadd): ditto
(generic_vec_fmul): ditto
(generic_crypto): ditto
(generic_vec_perm): ditto
(generic_vec_reduction): ditto
(generic_vec_ordered_reduction): ditto
(generic_vec_idiv): ditto
(generic_vec_float_divsqrt): ditto
(generic_vec_mask): ditto
(generic_vec_vesetvl): ditto
(generic_vec_setrm): ditto
(generic_vec_readlen): ditto
* config/riscv/sifive-7.md (sifive_7): new reservation
(sifive_7_vec_load): ditto
(sifive_7_vec_store): ditto
(sifive_7_vec_loadstore_seg): ditto
(sifive_7_sifive_7_vec_alu): ditto
(sifive_7_vec_fcmp): ditto
(sifive_7_vec_imul): ditto
(sifive_7_vec_fadd): ditto
(sifive_7_vec_fmul): ditto
(sifive_7_crypto): ditto
(sifive_7_vec_perm): ditto
(sifive_7_vec_reduction): ditto
(sifive_7_vec_ordered_reduction): ditto
(sifive_7_vec_idiv): ditto
(sifive_7_vec_float_divsqrt): ditto
(sifive_7_vec_mask): ditto
(sifive_7_vec_vesetvl): ditto
(sifive_7_vec_setrm): ditto
(sifive_7_vec_readlen): ditto

Signed-off-by: Edwin Lu 
Co-authored-by: Robin Dapp 
---
 gcc/config/riscv/generic-ooo.md |  19 ++---
 gcc/config/riscv/generic.md | 118 
 gcc/config/riscv/sifive-7.md| 118 
 3 files changed, 242 insertions(+), 13 deletions(-)

diff --git a/gcc/config/riscv/generic-ooo.md b/gcc/config/riscv/generic-ooo.md
index de93245f965..18b606bb981 100644
--- a/gcc/config/riscv/generic-ooo.md
+++ b/gcc/config/riscv/generic-ooo.md
@@ -106,16 +106,14 @@ (define_insn_reservation "generic_ooo_vec_store" 6
 ;; Vector segment loads/stores.
 (define_insn_reservation "generic_ooo_vec_loadstore_seg" 10
   (and (eq_attr "tune" "generic_ooo")
-   (eq_attr "type" "vlsegde,vlsegds,vlsegdux,vlsegdox,vlsegdff,\
-   vssegte,vssegts,vssegtux,vssegtox"))
+   (eq_attr "type" "vlsegde,vlsegds,vlsegdux,vlsegdox,vlsegdff,vssegte,vssegts,vssegtux,vssegtox"))
   "generic_ooo_vxu_issue,generic_ooo_vxu_alu")
 
 
 ;; Generic integer instructions.
 (define_insn_reservation "generic_ooo_alu" 1
   (and (eq_attr "tune" "generic_ooo")
-   (eq_attr "type" "unknown,const,arith,shift,slt,multi,auipc,nop,logical,\
-   move,bitmanip,rotate,min,max,minu,maxu,clz,ctz,atomic,condmove,cbo,mvpair,zicond"))
+   (eq_attr "type" "unknown,const,arith,shift,slt,multi,auipc,nop,logical,move,bitmanip,rotate,min,max,minu,maxu,clz,ctz,atomic,condmove,cbo,mvpair,zicond"))
   "generic_ooo_issue,generic_ooo_ixu_alu")
 
 (define_insn_reservation "generic_ooo_sfb_alu" 2
@@ -193,16 +191,13 @@ (define_insn_reservation "generic_ooo_popcount" 2
 ;; Regular vector operations and integer comparisons.
 (define_insn_reservation "generic_ooo_vec_alu" 3
   (and (eq_attr "tune" "generic_ooo")
-   (eq_attr "type" "vialu,viwalu,vext,vicalu,vshift,vnshift,viminmax,vicmp,\
-   vimov,vsalu,vaalu,vsshift,vnclip,vmov,vfmov,vector"))
+   (eq_attr "type" "vialu,viwalu,vext,vicalu,vshift,vnshift,viminmax,vicmp,vimov,vsalu,vaalu,vsshift,vnclip,vmov,vfmov,vector"))
   "generic_ooo_vxu_issue,generic_ooo_vxu_alu")
 
 ;; Vector float comparison, conversion etc.
 (define_insn_reservation "generic_ooo_vec_fcmp" 3
   (and (eq_attr "tune" "generic_ooo")
-   (eq_attr "type" "vfrecp,vfminmax,vfcmp,vfsgnj,vfclass,vfcvtitof,\
-   vfcvtftoi,vfwcvtitof,vfwcvtftoi,vfwcvtftof,vfncvtitof,\
-   vfncvtftoi,vfncvtftof"))
+   (eq_attr "type" "vfrecp,vfminmax,vfcmp,vfsgnj,vfclass,vfcvtitof,vfcvtftoi,vfwcvtitof,vfwcvtftoi,vfwcvtftof,vfncvtitof,vfncvtftoi,vfncvtftof"))
   "generic_ooo_vxu_issue,generic_ooo_vxu_alu")
 
 ;; Vector integer multiplication.
@@ -232,8 +227,7 @@ (define_insn_reservation "generic_ooo_crypto" 4
 ;; Vector permute.
 (define_insn_reservation "generic_ooo_perm" 3
   (and (eq_attr "tune" "generic_ooo")
-   (eq_attr "type" "vimerge,vfmerge,vslideup,vslidedown,vislide1up,\
-   vislide1down,vfslide1up,vfslide1down,vgather,vcompress"))
+   (eq_attr "type" "vimerge,vfmerge,vslideup,vslidedown,vislide1up,vislide1down,vfslide1up,vfslide1down,vgather,vcompress"))
   "generic_ooo_vxu_issue,generic_ooo_vxu_alu")
 
 ;; Vector reduction.
@@ -265,8 +259,7 @@ (define_insn_reservation 

[PATCH 1/3][RFC] RISC-V: Add non-vector types to pipelines

2023-12-15 Thread Edwin Lu
This patch does not create vector related insn reservations for
generic.md and sifive-7.md. It updates/creates insn reservations
for all non-vector typed insns

gcc/ChangeLog:

* config/riscv/generic-ooo.md (generic_ooo_sfb_alu): create/update 
reservation
(generic_ooo_branch): ditto
* config/riscv/generic.md (generic_sfb_alu): ditto
* config/riscv/sifive-7.md (sifive_7_popcount): ditto

Signed-off-by: Edwin Lu 
---
 gcc/config/riscv/generic-ooo.md | 16 +---
 gcc/config/riscv/generic.md | 13 +
 gcc/config/riscv/sifive-7.md| 12 +---
 3 files changed, 31 insertions(+), 10 deletions(-)

diff --git a/gcc/config/riscv/generic-ooo.md b/gcc/config/riscv/generic-ooo.md
index 78b9e48f935..de93245f965 100644
--- a/gcc/config/riscv/generic-ooo.md
+++ b/gcc/config/riscv/generic-ooo.md
@@ -95,7 +95,7 @@ (define_insn_reservation "generic_ooo_float_store" 6
 ;; Vector load/store
 (define_insn_reservation "generic_ooo_vec_load" 6
   (and (eq_attr "tune" "generic_ooo")
-   (eq_attr "type" "vlde,vldm,vlds,vldux,vldox,vldff,vldr"))
+   (eq_attr "type" "vlde,vldm,vlds,vldux,vldox,vldff,vldr,rdfrm"))
   "generic_ooo_vxu_issue,generic_ooo_vxu_alu")
 
 (define_insn_reservation "generic_ooo_vec_store" 6
@@ -115,9 +115,19 @@ (define_insn_reservation "generic_ooo_vec_loadstore_seg" 10
 (define_insn_reservation "generic_ooo_alu" 1
   (and (eq_attr "tune" "generic_ooo")
(eq_attr "type" "unknown,const,arith,shift,slt,multi,auipc,nop,logical,\
-   move,bitmanip,min,max,minu,maxu,clz,ctz"))
+   move,bitmanip,rotate,min,max,minu,maxu,clz,ctz,atomic,condmove,cbo,mvpair,zicond"))
   "generic_ooo_issue,generic_ooo_ixu_alu")
 
+(define_insn_reservation "generic_ooo_sfb_alu" 2
+  (and (eq_attr "tune" "generic_ooo")
+   (eq_attr "type" "sfb_alu"))
+  "generic_ooo_issue,generic_ooo_ixu_alu")
+
+;; Branch instructions
+(define_insn_reservation "generic_ooo_branch" 1
+  (and (eq_attr "tune" "generic_ooo")
+   (eq_attr "type" "branch,jump,call,jalr,ret,trap,pushpop"))
+  "generic_ooo_issue,generic_ooo_ixu_alu")
 
 ;; Float move, convert and compare.
 (define_insn_reservation "generic_ooo_float_move" 3
@@ -184,7 +194,7 @@ (define_insn_reservation "generic_ooo_popcount" 2
 (define_insn_reservation "generic_ooo_vec_alu" 3
   (and (eq_attr "tune" "generic_ooo")
(eq_attr "type" 
"vialu,viwalu,vext,vicalu,vshift,vnshift,viminmax,vicmp,\
-   vimov,vsalu,vaalu,vsshift,vnclip,vmov,vfmov"))
+   vimov,vsalu,vaalu,vsshift,vnclip,vmov,vfmov,vector"))
   "generic_ooo_vxu_issue,generic_ooo_vxu_alu")
 
 ;; Vector float comparison, conversion etc.
diff --git a/gcc/config/riscv/generic.md b/gcc/config/riscv/generic.md
index 88940483829..3e49d942495 100644
--- a/gcc/config/riscv/generic.md
+++ b/gcc/config/riscv/generic.md
@@ -27,7 +27,7 @@ (define_cpu_unit "fdivsqrt" "pipe0")
 
 (define_insn_reservation "generic_alu" 1
   (and (eq_attr "tune" "generic")
-   (eq_attr "type" "unknown,const,arith,shift,slt,multi,auipc,nop,logical,move,bitmanip,min,max,minu,maxu,clz,ctz,cpop"))
+   (eq_attr "type" "unknown,const,arith,shift,slt,multi,auipc,nop,logical,move,bitmanip,min,max,minu,maxu,clz,ctz,rotate,atomic,condmove,crypto,mvpair,zicond"))
   "alu")
 
 (define_insn_reservation "generic_load" 3
@@ -42,17 +42,22 @@ (define_insn_reservation "generic_store" 1
 
 (define_insn_reservation "generic_xfer" 3
   (and (eq_attr "tune" "generic")
-   (eq_attr "type" "mfc,mtc,fcvt,fmove,fcmp"))
+   (eq_attr "type" "mfc,mtc,fcvt,fmove,fcmp,cbo"))
   "alu")
 
 (define_insn_reservation "generic_branch" 1
   (and (eq_attr "tune" "generic")
-   (eq_attr "type" "branch,jump,call,jalr"))
+   (eq_attr "type" "branch,jump,call,jalr,ret,trap,pushpop"))
+  "alu")
+
+(define_insn_reservation "generic_sfb_alu" 2
+  (and (eq_attr "tune" "generic")
+   (eq_attr "type" "sfb_alu"))
   "alu")
 
 (define_insn_reservation "generic_imul" 10
   (and (eq_attr "tune" "generic")
-   (eq_attr "type" "imul,clmul"))
+   (eq_attr "type" "imul,clmul,cpop"))
   "imuldiv*10")
 
 (define_insn_reservation "generic_idivsi" 34
diff --git a/gcc/config/riscv/sifive-7.md b/gcc/config/riscv/sifive-7.md
index a63394c8c58..65d27cf6dc9 100644
--- a/gcc/config/riscv/sifive-7.md
+++ b/gcc/config/riscv/sifive-7.md
@@ -34,7 +34,7 @@ (define_insn_reservation "sifive_7_fpstore" 1
 
 (define_insn_reservation "sifive_7_branch" 1
   (and (eq_attr "tune" "sifive_7")
-   (eq_attr "type" "branch"))
+   (eq_attr "type" "branch,ret,trap"))
   "sifive_7_B")
 
 (define_insn_reservation "sifive_7_sfb_alu" 2
@@ -44,7 +44,7 @@ (define_insn_reservation "sifive_7_sfb_alu" 2
 
 (define_insn_reservation "sifive_7_jump" 1
   (and (eq_attr "tune" "sifive_7")
-   (eq_attr "type" "jump,call,jalr"))
+   (eq_attr "type" "jump,call,jalr,pushpop"))
   "sifive_7_B")
 
 (define_insn_reservation "sifive_7_mul" 3
@@ -59,7 

[PATCH 0/3][RFC] RISC-V: Associate typed insns to dfa reservation

2023-12-15 Thread Edwin Lu
This series is a prototype for adding all typed instructions to a dfa 
scheduling pipeline.

I've been working on adding insn reservations for all typed instructions
to ensure all instructions are part of a dfa pipeline. I don't have a good 
understanding of vector instruction latency, so I have been struggling
with what I should do for those. 

As of right now, I have copied the insn reservations from generic-ooo.md 
for vector instructions into the generic.md and sifive-7.md files. This 
prevents ICEs from enabling the assert but introduces numerous scan
dump failures (when tested in linux rv64gcv and rv64gc_zba_zbb_zbc_zbs).

Currently, only patch 1/3 RISC-V: Add non-vector types to pipelines does
not introduce regressions (when tested against linux rv32/64 gc/gcv
on rocket).  I hope that the locations I added the insn types make sense. 
Please let me know if they should change.

The final patch enables the assert for insn_has_dfa_reservation. 

I tested the full patch series on both rocket and sifive-7-series. The
series does introduce additional scan dump failures compared to their
respective baselines; however, I'm not sure how many failures are
due to the patch versus incorrect modeling assumptions. I created
PR113035, which has the full testsuite failures I saw (without the patches
applied).

Edwin Lu (3):
  RISC-V: Add non-vector types to pipelines
  RISC-V: Add vector related reservations
  RISC-V: Enable assert for insn_has_dfa_reservation

 gcc/config/riscv/generic-ooo.md |  31 
 gcc/config/riscv/generic.md | 131 +++-
 gcc/config/riscv/riscv.cc   |   2 -
 gcc/config/riscv/sifive-7.md| 130 ++-
 4 files changed, 271 insertions(+), 23 deletions(-)

-- 
2.34.1



[Bug ada/113037] New: GNAT BUG DETECTED when instantiating generic package with Type_Invariant on a type derived from a generic type

2023-12-15 Thread saulius.grazulis at bti dot vu.lt via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113037

Bug ID: 113037
   Summary: GNAT BUG DETECTED when instantiating generic package
with Type_Invariant on a type derived from a generic
type
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: ada
  Assignee: unassigned at gcc dot gnu.org
  Reporter: saulius.grazulis at bti dot vu.lt
CC: dkm at gcc dot gnu.org
  Target Milestone: ---

Created attachment 56890
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56890&action=edit
A Minimal Working Example, for 'gnatchop'

When a generic package with a Type_Invariant on a declared type is
instantiated, a GNAT BUG DETECTED box is triggered:

+ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/home/saulius/install/gcc/gcc-13.2.0-git-master-d1647917006/libexec/gcc/x86_64-pc-linux-gnu/14.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc/configure
--prefix=/home/saulius/install/gcc/gcc-13.2.0-git-master-d1647917006
--enable-languages=c,c++,ada,fortran --disable-multilib --enable-shared
--enable-threads=posix --enable-__cxa_atexit --enable-clocale=gnu
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 14.0.0 20231214 (experimental) (GCC) 
+ uname -a
Linux tasmanijos-velnias 5.4.0-167-generic #184-Ubuntu SMP Tue Oct 31 09:21:49
UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
+ lsb_release -a
No LSB modules are available.
Distributor ID: Linuxmint
Description:Linux Mint 20.1
Release:20.1
Codename:   ulyssa
+ gnatmake extended_euklidean_gen
gcc -c extended_euklidean_gen.adb
+===GNAT BUG DETECTED==+
| 14.0.0 20231214 (experimental) (x86_64-pc-linux-gnu) Assert_Failure exp_util.adb:3693|
| Error detected at extended_euklidean_generic.ads:15:4 [extended_euklidean_gen.adb:12:4]|
| Compiling extended_euklidean_gen.adb |
| Please submit a bug report; see https://gcc.gnu.org/bugs/ .  |
| Use a subject line meaningful to you and us to track the bug.|
| Include the entire contents of this bug box in the report.   |
| Include the exact command that you entered.  |
| Also include sources listed below.   |
+==+

Please include these source files with error report
Note that list may not be accurate in some cases,
so please double check that the problem can still
be reproduced with the set of files listed.
Consider also -gnatd.n switch (see debug.adb).

extended_euklidean_gen.adb
extended_euklidean_generic.ads

compilation abandoned
gnatmake: "extended_euklidean_gen.adb" compilation error

$ cat extended_euklidean_gen.adb
pragma Ada_2022;

with Ada.Numerics.Big_Numbers.Big_Integers;
use Ada.Numerics.Big_Numbers.Big_Integers;

with Extended_Euklidean_Generic;

procedure Extended_Euklidean_Gen is

   type Number is new Big_Integer;

   package Extended_Euklidean is
     new Extended_Euklidean_Generic (Number, 0, 1);

begin
   null;
end Extended_Euklidean_Gen;

$ cat extended_euklidean_generic.ads
generic
   type Number is private;
   Zero : Number;
   Unity : Number;
   with function "-" (A, B : Number) return Number is <>;
   with function ">" (A, B : Number) return Boolean is <>;

package Extended_Euklidean_Generic is

   type Positive_Number is new Number
     with Type_Invariant => Number (Positive_Number) > Zero;

   procedure GCD
     (
      A, B : in  Positive_Number; -- original numbers
      D    : out Positive_Number; -- GCD of the two numbers A and B
      M, N : out Number           -- Bézout coefficients: A * M + B * N = D
     );

end Extended_Euklidean_Generic;

$ cat extended_euklidean_generic.adb
package body Extended_Euklidean_Generic is

   -- Test implementation of the Extended Euclidean Algorithm.
   -- URL: https://en.wikipedia.org/wiki/Extended_Euclidean_algorithm

   procedure GCD
     (
      A, B : in  Positive_Number; -- original numbers
      D    : out Positive_Number; -- GCD of the two numbers A and B
      M, N : out Number           -- Bézout coefficients: A * M + B * N = D
     ) is
      P : Number := Unity; -- X = P*A + Q*B at any point
      Q : Number := Zero;
      S : Number := Zero;  -- Y = S*A + T*B at any point
      T : Number := Unity;
      X : Positive_Number := A;
      Y : Positive_Number := B;
   begin
      while X /= Y loop
         if X > Y then
            X := X - Y;
            P := P - S;
            Q := Q - T;
         else
            Y := Y - X;
            S := S - P;
            T := T - Q;
         end if;
      end loop;
      D := X;
      M := P;
      N := Q;
      -- The naive computation of the N value can overflow:
      -- N := (X - P*A) / B;
      pragma 

[Bug ada/113036] New: GNAT BUG DETECTED ICE box triggered by a default value that is a nested iterated aggregate

2023-12-15 Thread saulius.grazulis at bti dot vu.lt via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113036

            Bug ID: 113036
           Summary: GNAT BUG DETECTED ICE box triggered by a default value
                    that is a nested iterated aggregate
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: ada
          Assignee: unassigned at gcc dot gnu.org
          Reporter: saulius.grazulis at bti dot vu.lt
                CC: dkm at gcc dot gnu.org
  Target Milestone: ---

Created attachment 56889
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56889&action=edit
A Minimal Working Example

When a discriminated record type has a 2D array as a component, and a
default initialization value with nested iterated arrays is provided, a GNAT
BUG DETECTED box is triggered:

+ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/home/saulius/install/gcc/gcc-13.2.0-git-master-d1647917006/libexec/gcc/x86_64-pc-linux-gnu/14.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc/configure
--prefix=/home/saulius/install/gcc/gcc-13.2.0-git-master-d1647917006
--enable-languages=c,c++,ada,fortran --disable-multilib --enable-shared
--enable-threads=posix --enable-__cxa_atexit --enable-clocale=gnu
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 14.0.0 20231214 (experimental) (GCC) 

+ uname -a
Linux tasmanijos-velnias 5.4.0-167-generic #184-Ubuntu SMP Tue Oct 31 09:21:49
UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

+ lsb_release -a
No LSB modules are available.
Distributor ID: Linuxmint
Description:    Linux Mint 20.1
Release:        20.1
Codename:       ulyssa

+ gnatmake pm_type_default_not_ok.adb
gcc -c pm_type_default_not_ok.adb
+===GNAT BUG DETECTED==+
| 14.0.0 20231214 (experimental) (x86_64-pc-linux-gnu) Assert_Failure failed precondition from sinfo-nodes.ads:3628|
| Error detected at pm_type_default_not_ok.adb:9:4 |
| Compiling pm_type_default_not_ok.adb |
| Please submit a bug report; see https://gcc.gnu.org/bugs/ .  |
| Use a subject line meaningful to you and us to track the bug.|
| Include the entire contents of this bug box in the report.   |
| Include the exact command that you entered.  |
| Also include sources listed below.   |
+==+

Please include these source files with error report
Note that list may not be accurate in some cases,
so please double check that the problem can still
be reproduced with the set of files listed.
Consider also -gnatd.n switch (see debug.adb).

pm_type_default_not_ok.adb

compilation abandoned
gnatmake: "pm_type_default_not_ok.adb" compilation error

$ cat pm_type_default_not_ok.adb 
pragma Ada_2022;

procedure PM_Type_Default_Not_OK is

   type One_Or_Zero is range 0 .. 1;

   type Permutation_Matrix_Array is array (Positive range <>, Positive range <>) of One_Or_Zero;

   type Permutation_Matrix (N : Positive) is record
      A : Permutation_Matrix_Array (1 .. N, 1 .. N) :=
        (for I in 1 .. N => (for J in 1 .. N => 0));
   end record;

begin
   null;
end;

Not nesting the value, (e.g. having (others => 0)), or using the same iterated
expression in an assignment (as opposed to initialization) works OK.

Re: [PATCH] match.pd: Optimize sign-extension followed by truncation [PR113024]

2023-12-15 Thread Richard Sandiford
Jakub Jelinek  writes:
> Hi!
>
> While looking at a bitint ICE, I've noticed we don't optimize
> in f1 and f5 functions below the 2 casts into just one at GIMPLE,
> even though we optimize it in convert_to_integer if it appears in the same
> stmt.  The large match.pd simplification of two conversions in a row
> has many complex rules and as the testcase shows, everything else from
> the narrowest -> widest -> prec_in_between all integer conversions
> is already handled, either because the inside_unsignedp == inter_unsignedp
> rule kicks in, or the
>  && ((inter_unsignedp && inter_prec > inside_prec)
>  == (final_unsignedp && final_prec > inter_prec))
> one, but there is no reason why sign extension from narrowest to
> widest type followed by truncation to something in between can't be
> done just as sign extension from narrowest to the final type.  After all,
> if the widest type is signed rather than unsigned, regardless of the final
> type signedness we already handle it that way.
> And since PR93044 we also handle it if the final precision is not wider
> than the inside precision.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2023-12-14  Jakub Jelinek  
>
>   PR tree-optimization/113024
>   * match.pd (two conversions in a row): Simplify scalar integer
>   sign-extension followed by truncation.
>
>   * gcc.dg/tree-ssa/pr113024.c: New test.
>
> --- gcc/match.pd.jj   2023-12-14 11:59:28.0 +0100
> +++ gcc/match.pd  2023-12-14 18:25:00.457961975 +0100
> @@ -4754,11 +4754,14 @@ (define_operator_list SYNC_FETCH_AND_AND
>  /* If we have a sign-extension of a zero-extended value, we can
> replace that by a single zero-extension.  Likewise if the
> final conversion does not change precision we can drop the
> -   intermediate conversion.  */
> +   intermediate conversion.  Similarly truncation of a sign-extension
> +   can be replaced by a single sign-extension.  */
>  (if (inside_int && inter_int && final_int
>&& ((inside_prec < inter_prec && inter_prec < final_prec
> && inside_unsignedp && !inter_unsignedp)
> -  || final_prec == inter_prec))
> +  || final_prec == inter_prec
> +  || (inside_prec < inter_prec && inter_prec > final_prec
> +  && !inside_unsignedp && inter_unsignedp)))

Just curious: is the inter_unsignedp part needed for correctness?
If it's bigger than both the initial and final type then I wouldn't
have expected its signedness to matter.

Thanks,
Richard

>   (ocvt @0))
>  
>  /* Two conversions in a row are not needed unless:
> --- gcc/testsuite/gcc.dg/tree-ssa/pr113024.c.jj   2023-12-14 18:35:30.652225327 +0100
> +++ gcc/testsuite/gcc.dg/tree-ssa/pr113024.c  2023-12-14 18:37:42.056403418 +0100
> @@ -0,0 +1,22 @@
> +/* PR tree-optimization/113024 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-forwprop1" } */
> +/* Make sure we have just a single cast per function rather than 2 casts in some cases.  */
> +/* { dg-final { scan-tree-dump-times " = \\\(\[a-z \]*\\\) \[xy_\]" 16 "forwprop1" { target { ilp32 || lp64 } } } } */
> +
> +unsigned int f1 (signed char x) { unsigned long long y = x; return y; }
> +unsigned int f2 (unsigned char x) { unsigned long long y = x; return y; }
> +unsigned int f3 (signed char x) { long long y = x; return y; }
> +unsigned int f4 (unsigned char x) { long long y = x; return y; }
> +int f5 (signed char x) { unsigned long long y = x; return y; }
> +int f6 (unsigned char x) { unsigned long long y = x; return y; }
> +int f7 (signed char x) { long long y = x; return y; }
> +int f8 (unsigned char x) { long long y = x; return y; }
> +unsigned int f9 (signed char x) { return (unsigned long long) x; }
> +unsigned int f10 (unsigned char x) { return (unsigned long long) x; }
> +unsigned int f11 (signed char x) { return (long long) x; }
> +unsigned int f12 (unsigned char x) { return (long long) x; }
> +int f13 (signed char x) { return (unsigned long long) x; }
> +int f14 (unsigned char x) { return (unsigned long long) x; }
> +int f15 (signed char x) { return (long long) x; }
> +int f16 (unsigned char x) { return (long long) x; }
>
>   Jakub
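As a quick sanity check on the question above, here is a small C sketch. It is not part of the patch, and count_mismatches is a hypothetical helper name. For every signed char value it compares a single conversion to 32 bits against going through a 64-bit intermediate of either signedness. It only exercises C conversion semantics, so it says nothing definitive about the match.pd condition itself.

```c
#include <limits.h>

/* Hypothetical helper (not from the patch): counts the signed char inputs
   for which widening to a 64-bit intermediate and then truncating differs
   from a single sign-extension to the final type.  */
int count_mismatches(void)
{
    int mismatches = 0;
    for (int v = SCHAR_MIN; v <= SCHAR_MAX; v++) {
        signed char x = (signed char) v;
        /* One sign-extension straight to the final width.  */
        unsigned int direct = (unsigned int) (int) x;
        /* Unsigned wide intermediate, the shape of f1/f5/f9/f13.  */
        unsigned int via_u64 = (unsigned int) (unsigned long long) x;
        /* Signed wide intermediate, the shape of f3/f7/f11/f15.  */
        unsigned int via_s64 = (unsigned int) (long long) x;
        if (via_u64 != direct || via_s64 != direct)
            mismatches++;
    }
    return mismatches;
}
```

count_mismatches() should return 0, which is at least consistent with the signedness of the wide intermediate type not affecting the final value.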


[Bug target/113035] New: RISC-V: -mtune=sifive-7-series additional dump failures found with bitmanip, zicond, and vector targets

2023-12-15 Thread ewlu at rivosinc dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113035

            Bug ID: 113035
           Summary: RISC-V: -mtune=sifive-7-series additional dump
                    failures found with bitmanip, zicond, and vector
                    targets
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ewlu at rivosinc dot com
  Target Milestone: ---

Created attachment 56888
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56888&action=edit
testsuite failures for rv64 bitmanip and vector as of r14-6557-g767e2674875

I was testing different cpu/tunes for cleaning up the scheduler and saw that
there are significant differences in testsuite results between using
mtune=rocket (default) and mtune=sifive-7-series. 

Most notably, there are many failures in gcc.target/riscv/rvv/vsetvl,
gcc.target/riscv/zicond-primitiveSemantics_compare_*.c, and some with zbb and
zbs.

Full log output included in attachment

Configuration:
   riscv-sim/-march=rv64gc_zba_zbb_zbc_zbs/-mabi=lp64d/-mtune=sifive-7-series/-mcpu=sifive-u74/-mcmodel=medlow
   riscv-sim/-march=rv64gcv/-mabi=lp64d/-mtune=sifive-7-series/-mcpu=sifive-u74/-mcmodel=medlow

[Bug tree-optimization/54742] Switch elimination in FSM loop

2023-12-15 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54742

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |5.0

[Bug target/113034] New: Miscompilation of __m128 ne comparison on LoongArch

2023-12-15 Thread c at jia dot je via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113034

            Bug ID: 113034
           Summary: Miscompilation of __m128 ne comparison on LoongArch
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: c at jia dot je
  Target Milestone: ---

Compile and run the following code:

```
#include <lsxintrin.h>
#include <stdio.h>
__m128i unord_vec(__m128 a, __m128 b) { return (a != a) | (b != b); }

int unord_float(float a, float b) { return (a != a) | (b != b); }

int main() {
  float nan = 0.0 / 0.0; // nan
  __m128 nan_vec = {nan, nan};
  int res_float = unord_float(nan, nan);
  __m128i res_vec = unord_vec(nan_vec, nan_vec);
  printf("%d %ld %ld\n", res_float, res_vec[0], res_vec[1]);
  return 0;
}
```

Compile command: `gcc-14 -mlsx test.c -O -o test`. GCC version is the 14.0.0
20231203 snapshot.

It does the `unordered` comparison between two floats. The expected output:

```
1 1 1
```

Actual output:

```
1 0 0
```

Reading the assembly, the `unord_vec` is implemented wrongly as `vfcmp.cne.s`:

```
unord_vec:
.LFB538 = .
.cfi_startproc
vinsgr2vr.d $vr0,$r4,0
vinsgr2vr.d $vr0,$r5,1
vinsgr2vr.d $vr1,$r6,0
vinsgr2vr.d $vr1,$r7,1
vfcmp.cne.s $vr0,$vr0,$vr0
vfcmp.cne.s $vr1,$vr1,$vr1
vor.v   $vr0,$vr0,$vr1
vpickve2gr.du   $r4,$vr0,0
vpickve2gr.du   $r5,$vr0,1
jr  $r1
.cfi_endproc
```

Whereas `unord_float` is correctly implemented as `fcmp.cune.s`:

```
unord_float:
.LFB539 = .
.cfi_startproc
addi.w  $r4,$r0,1   # 0x1
fcmp.cune.s $fcc0,$f0,$f0
bcnez   $fcc0,.L3
or  $r4,$r0,$r0
.L3:
addi.w  $r12,$r0,1  # 0x1
fcmp.cune.s $fcc1,$f1,$f1
bcnez   $fcc1,.L4
or  $r12,$r0,$r0
.L4:
or  $r4,$r4,$r12
andi    $r4,$r4,1
jr  $r1
.cfi_endproc
```

So there is a mismatch on the `unordered` case. Besides, these functions can be
optimized to use `vfcmp.cun.s` and `fcmp.cun.s`.
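
For reference, a minimal scalar sketch of the predicate being discussed, in portable C. The function names are illustrative only. It shows that `(a != a) | (b != b)` and the C99 `isunordered` macro agree, which is the behaviour the vector version is expected to match. This assumes default floating-point semantics (no -ffast-math or -ffinite-math-only).

```c
#include <math.h>

/* Scalar form of the reporter's check: 1 if either operand is a NaN,
   because a NaN never compares equal to itself.  */
int unord_float(float a, float b)
{
    return (a != a) | (b != b);
}

/* The same predicate expressed through the standard C99 macro.  */
int unord_float_std(float a, float b)
{
    return isunordered(a, b) ? 1 : 0;
}
```

Both functions should return 1 whenever at least one argument is a NaN and 0 otherwise.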

[Bug c++/113032] RISCV linker relaxation leaves redundant addi (from load immediate)

2023-12-15 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113032

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |MOVED
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #1 from Andrew Pinski  ---
The linker is part of the GNU binutils project, please report this issue to
them instead; https://sourceware.org/bugzilla .

[r14-6559 Regression] FAIL: gcc.dg/guality/pr58791-4.c -Os -DPREVENT_OPTIMIZATION line pr58791-4.c:32 i == 486 on Linux/x86_64

2023-12-15 Thread haochen.jiang
On Linux/x86_64,

8afdbcdd7abe1e3c7a81e07f34c256e7f2dbc652 is the first bad commit
commit 8afdbcdd7abe1e3c7a81e07f34c256e7f2dbc652
Author: Di Zhao 
Date:   Fri Dec 15 03:22:32 2023 +0800

Consider fully pipelined FMA in get_reassociation_width

caused

FAIL: gcc.dg/pr110279-2.c scan-tree-dump-not reassoc2 "was chosen for reassociation"
FAIL: gcc.dg/pr110279-2.c scan-tree-dump-times optimized "\\.FMA " 3

with GCC configured with

../../gcc/configure 
--prefix=/export/users/haochenj/src/gcc-bisect/master/master/r14-6559/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=gcc.dg/pr110279-2.c --target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=gcc.dg/pr110279-2.c --target_board='unix{-m64}'"

(Please do not reply to this email. For questions about this report, contact me
at haochen dot jiang at intel.com.)
(If you run into Cascade Lake related problems, disabling AVX512F on the command
line might help.)
(However, please make sure that there are no potential problems with AVX512.)


[Bug c++/83417] Pointer-to-member template parameter with auto member type dependent container type does not work (C++17)

2023-12-15 Thread waffl3x at protonmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83417

waffl3x  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |waffl3x at protonmail dot com

--- Comment #5 from waffl3x  ---
https://godbolt.org/z/Kxco7c5Es

Still not working in trunk, was it ever decided on whether or not this
is well-formed?

```
template struct C {};
template 
void zoink(C) { }

struct S { int _m; };

void go() { zoink(C<::_m>{}); }
```
https://godbolt.org/z/ohG9ra5en
This is the workaround I would use, it has better ergonomics anyway. I
believe we should decide on whether this truly is a bug or not and if
not just simply close it.

Credit to Tsche for pointing this case out to me.

Re: Question about creating global varaiable during IPA PASS.

2023-12-15 Thread Thomas Schwinge
Hi Hanke!

On 2023-12-13T17:04:57+0800, Hanke Zhang via Gcc  wrote:
> Hi, I'm trying to create a global variable in my own PASS which
> located at the LATE_IPA_PASSES. (I'm using GCC 10.3.0.)

I can't comment on IPA aspects, or whether something was different on
oldish GCC 10 (why using that one, by the way?), and I've not actually
verified what you're doing here:

> And after creating it, I added the attributes like the following.
>
> // 1. create the var
> tree new_name = get_identifier (xx);
> tree new_type = build_pointer_type (xx);
> tree new_var = build_decl (UNKNOWN_LOCATION, VAR_DECL, new_name, new_type);
> add_attributes (new_var);
>
> static void
> add_attributes (tree var)
> {
> DECL_ARTIFICIAL (var) = 1;
> DECL_EXTERNAL (var) = 0;
> TREE_STATIC (var) = 1;
> TREE_PUBLIC (var) = 1;
> TREE_USED (var) = 1;
> DECL_CONTEXT (var) = NULL_TREE;
> TREE_THIS_VOLATILE (var) = 0;
> TREE_ADDRESSABLE (var) = 0;
> TREE_READONLY (var) = 0;
> if (is_global_var (var))
>   set_decl_tls_model (var, TLS_MODEL_NONE);
> }
>
> But when I try to compile some example files with -flto, error occurs.
>
> /usr/bin/ld: xxx.ltrans0.ltrans.o: in function `xxx':
> xxx.c: undefined reference to `glob_var'
> xxx.c: undefined reference to `glob_var'
> xxx.c: undefined reference to `glob_var'
>
> Here `glob_var' is the global variable created in my PASS.
>
> I would like to ask, am I using some attributes incorrectly?

..., but are you maybe simply missing a call to
'varpool_node::add (var);' or 'varpool_node::finalize_decl (var);' or
something like that?  See other uses of those, and the description in
'gcc/cgraph.h', 'struct [...] varpool_node':

  /* Add the variable DECL to the varpool.
 Unlike finalize_decl function is intended to be used
 by middle end and allows insertion of new variable at arbitrary point
 of compilation.  */
  static void add (tree decl);

  /* Mark DECL as finalized.  By finalizing the declaration, frontend instruct
     the middle end to output the variable to asm file, if needed or externally
     visible.  */
  static void finalize_decl (tree decl);

If that's not it, we'll have to look in more detail.


Grüße
 Thomas


Re: Discussion about arm/aarch64 testcase failures seen with patch for PR111673

2023-12-15 Thread Surya Kumari Jangala via Gcc
Hi Richard,
Here are more details about the testcase failure and my analysis/fix:

Testcase:

void f(int *i)
{
if (!i)
return;
else
{
__builtin_printf("Hi");
*i=0;
}
}

--

Assembly w/o patch:
cbz x0, .L7
stp x29, x30, [sp, -32]!
mov x29, sp
str x19, [sp, 16]
mov x19, x0
adrp    x0, .LC0
add x0, x0, :lo12:.LC0
bl  printf
str wzr, [x19]
ldr x19, [sp, 16]
ldp x29, x30, [sp], 32
ret
.p2align 2,,3
.L7:
ret

---

Assembly w/ patch:
stp x29, x30, [sp, -32]!
mov x29, sp
str x0, [sp, 24]
cbz x0, .L1
adrp    x0, .LC0
add x0, x0, :lo12:.LC0
bl  printf
ldr x1, [sp, 24]
str wzr, [x1]
.L1:
ldp x29, x30, [sp], 32
ret


As we can see above, w/o patch the test case gets shrink wrapped.

Input RTL to the LRA pass (the RTL is same both w/ and w/o patch):

BB2:
  set r95, x0
  set r92, r95
  if (r92 eq 0) jump BB4
BB3:
  set x0, symbol-ref("Hi")
  x0 = call printf
  set mem(r92), 0
BB4:
  ret


Register assignment by IRA:
w/o patch:
  r92-->x19
  r95-->x0
  r94-->x0

w/ patch:
  r92-->x1
  r95-->x0
  r94-->x0


RTL after LRA:

w/o patch:
BB2:
  set x19, x0
  if (x19 eq 0) jump BB4
BB3:
  set x0, symbol-ref("Hi")
  x0 = call printf
  set mem(x19), 0
BB4:
  ret


w/ patch:
BB2:
  set x1, x0
  set mem(sp+24), x1
  if (x1 eq 0) jump BB4
BB3:
  set x0, symbol-ref("Hi")
  x0 = call printf
  set x1, mem(sp+24)
  set mem(x1), 0
BB4:
  ret


The difference between w/o patch and w/ patch is that w/o patch, a callee-save
register (x19) is chosen to hold the value of x0 (input parameter register),
while w/ patch, a caller-save register (x1) is chosen.

W/o patch, during the shrink wrap pass, first copy propagation is done and
the 'if' insn in BB2 is changed as follows:
  set x19, x0
  if (x19 eq 0) jump BB4

changed to:
  set x19, x0
  if (x0 eq 0) jump BB4   

Next, the insn "set x19, x0" is moved down the cfg to BB3. Since x19 is a
callee-save register, prolog gets generated in BB3 thereby resulting in
successful shrink wrapping.

W/ patch, during the shrink wrap pass, copy propagation changes BB2 as follows:
  set x1, x0
  set mem(sp+24), x1
  if (x1 eq 0) jump BB4

changed to:
  set x1, x0
  set mem(sp+24), x0
  if (x0 eq 0) jump BB4

However, the store insn (set mem(sp+24), x0) cannot be moved down to BB3.
Hence the prolog gets generated in BB2 itself due to the use of 'sp', and
shrink wrapping fails.

The store insn (which basically saves x1 to the stack) is generated by the
LRA pass. This insn is needed because x1 is a caller-save register and we
have a call insn that will clobber this register. However, the store insn is
generated in the entry BB (BB2) instead of in BB3, which has the call insn.
If the store is generated in BB3, then the testcase will be shrink wrapped
successfully. In fact, it is more efficient if the store occurs only in the
path containing the printf call instead of occurring in the entry BB.

The reason why LRA generates the store insn in the entry BB is as follows:
LRA emits insns to save caller-save registers in the inheritance/splitting pass.
In this pass, LRA builds EBBs (Extended Basic Blocks) and traverses the insns in
the EBBs in reverse order from the last insn to the first insn. When LRA sees a
write to a pseudo (that has been assigned a caller-save register), and there is
a read following the write, with an intervening call insn between the write and
the read, then LRA generates a spill immediately after the write and a restore
immediately before the read. The spill is needed because the call insn will
clobber the caller-save register.

In the above testcase, LRA forms two EBBs: the first EBB contains BB2 & BB3
while the second EBB contains BB4.

In BB2, there is a write to x1 in the insn:
set r92, r95      // r92 is assigned x1 and r95 is assigned x0

In BB3, there is a read of x1 after the call insn:
set mem(r92), 0   // r92 is assigned x1

So LRA generates a spill in BB2 after the write to x1.

I have a patch (bootstrapped and regtested on powerpc) that makes changes in
LRA to save caller-save registers before a call instead of after the write to
the caller-save register. With this patch, both the above tests get successfully
shrink wrapped. After committing the patch for PR111673, I plan to get the
LRA fix reviewed.
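
To make the two placements concrete, here is a toy C model of the EBB from the testcase. It is only a sketch: the types and function names are hypothetical and have nothing to do with the real LRA data structures. It simply reports which basic block would receive the spill under each policy.

```c
#include <stddef.h>

/* Toy model only: these types are hypothetical, not LRA data structures.  */
enum insn_kind { INSN_WRITE, INSN_BRANCH, INSN_CALL, INSN_READ };

struct insn {
    enum insn_kind kind;
    int bb;                 /* number of the basic block holding the insn */
};

/* Placement described above: spill immediately after the write.  */
int spill_bb_after_write(const struct insn *ebb, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (ebb[i].kind == INSN_WRITE)
            return ebb[i].bb;
    return -1;              /* no write found */
}

/* Proposed placement: spill immediately before the first call that
   follows the write.  */
int spill_bb_before_call(const struct insn *ebb, size_t n)
{
    int seen_write = 0;
    for (size_t i = 0; i < n; i++) {
        if (ebb[i].kind == INSN_WRITE)
            seen_write = 1;
        else if (seen_write && ebb[i].kind == INSN_CALL)
            return ebb[i].bb;
    }
    return -1;              /* no call after the write */
}

/* The first EBB of the testcase: BB2 = {write x1, branch},
   BB3 = {call printf, read x1}.  */
const struct insn example_ebb[] = {
    { INSN_WRITE, 2 }, { INSN_BRANCH, 2 }, { INSN_CALL, 3 }, { INSN_READ, 3 },
};
```

In this model the first policy puts the spill in BB2 (the entry block, which blocks shrink wrapping) and the second puts it in BB3, next to the call.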

Please let me know if you need more information.

Regards,
Surya


On 14/12/23 9:41 pm, Richard Earnshaw (lists) wrote:
> On 14/12/2023 07:17, Surya Kumari Jangala via Gcc wrote:
>> Hi Richard,
>> Thanks a lot for your response!
>>
>> Another failure reported by the Linaro CI is as follows:
>>
>> Running gcc:gcc.dg/dg.exp ...
>> FAIL: gcc.dg/ira-shrinkwrap-prep-1.c scan-rtl-dump pro_and_epilogue 
>> "Performing 

[Bug target/113033] New: GCC 14 (20231203 snapshot) ICE when building LSX vector rotate code on LoongArch

2023-12-15 Thread c at jia dot je via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113033

            Bug ID: 113033
           Summary: GCC 14 (20231203 snapshot) ICE when building LSX
                    vector rotate code on LoongArch
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: c at jia dot je
  Target Milestone: ---

Source code:

```
#include <lsxintrin.h>

v2u64 test(v2u64 a, int imm) {
  return (a << (imm & 63)) | (a >> (64 - (imm & 63)));
}

```

Command line to reproduce:

```
gcc-14 -mlsx -c test.c
```

Error message:

```
test.c: In function ‘test’:
test.c:5:1: error: unrecognizable insn:
5 | }
  | ^
(insn 16 15 17 2 (set (reg:DI 92)
(and:DI (neg:DI (reg:DI 80 [ _1 ]))
(const_int 63 [0x3f]))) "test.c":4:28 -1
 (nil))
during RTL pass: vregs
test.c:5:1: internal compiler error: in extract_insn, at recog.cc:2804
0x591d419f internal_error(char const*, ...)
???:0
0x577b4d87 fancy_abort(char const*, int, char const*)
???:0
0x577998bb _fatal_insn(char const*, rtx_def const*, char const*, int, char const*)
???:0
0x577998ef _fatal_insn_not_found(rtx_def const*, char const*, int, char const*)
???:0
0x580704f7 extract_insn(rtx_insn*)
???:0
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See  for instructions.
```
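
For comparison, the `(and:DI (neg:DI ...) (const_int 63))` in the unrecognizable insn has the same shape as the usual scalar rotate idiom, where the right-shift count is written as `-n & 63` so that it never reaches 64. A scalar sketch follows; rotl64 is an illustrative name, and this is only an observation about the shape of the expression, not a diagnosis of the ICE.

```c
#include <stdint.h>

/* Rotate x left by n modulo 64.  Writing the complementary count as
   (-n & 63) keeps both shift counts in the range 0..63, so n == 0 does
   not turn into a shift by 64.  */
uint64_t rotl64(uint64_t x, unsigned n)
{
    n &= 63;
    return (x << n) | (x >> (-n & 63));
}
```

With this form, rotating by 0 or by 64 returns x unchanged.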

[Bug c++/113032] New: RISCV linker relaxation leaves redundant addi (from load immediate)

2023-12-15 Thread iwfinlay at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113032

            Bug ID: 113032
           Summary: RISCV linker relaxation leaves redundant addi (from
                    load immediate)
           Product: gcc
           Version: 12.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: iwfinlay at gmail dot com
  Target Milestone: ---

Created attachment 56887
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56887&action=edit
Source files and save-temps for ELF build

In the ELF image for RISCV 32-bit, the lower 12-bit parts of a load immediate
are zero but not removed. Hence, it appears as a 'mv' in the objdump of the
program ELF. 

20010064 <_Z6globalv>:
20010064:   8537        lui     a0,0x8
20010068:   00050513    mv      a0,a0   <-- (1)
2001006c:   8067        ret

20010070 <_Z4initv>:
20010070:   87b7        lui     a5,0x8
20010074:   00078793    mv      a5,a5   <-- (2)
20010078:   3e700713    li      a4,999
2001007c:   00e7a223    sw      a4,4(a5) # 8004 <__global_pointer$+0xf804>
20010080:   0ea00713    li      a4,234
20010084:   00e7a423    sw      a4,8(a5)
20010088:   8067        ret

2001008c <_Z5pcnt0v>:
2001008c:   87b7        lui     a5,0x8
20010090:   00078793    mv      a5,a5   <-- (3)
20010094:   0047a703    lw      a4,4(a5) # 8004 <__global_pointer$+0xf804>
20010098:   60271713    cpop    a4,a4

The command line is:
riscv64-unknown-elf-g++ --save-temps -march=rv32imv_zbb_zbs -mabi=ilp32 -O3
-Wall -Wextra -o relax.elf main.cpp test.cpp startup.cpp -fno-builtin -static
-fno-common -mcmodel=medlow -nostdlib -T link.ld

The behavior is the same if I drop -fno-builtin -fno-common -mcmodel (just need
to provide memset).

I'm attaching a tar-file for the above.

I reviewed 91713 but concluded that it's different. I also filed this yesterday;
I apologize for wasting reviewer time. It's the same test case, but I'm now
including the main to show the overall optimization.

Thanks for the previous feedback.

[Bug modula2/112946] Assignment of string to enumeration or set crashes

2023-12-15 Thread gaiusmod2 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112946

--- Comment #5 from gaiusmod2 at gmail dot com ---
many thanks for the bug report - all fixed in gcc master branch.

Re: [PATCH v4 10/11] aarch64: Add new load/store pair fusion pass

2023-12-15 Thread Richard Sandiford
Alex Coplan  writes:
> This is a v6 of the aarch64 load/store pair fusion pass, which
> addresses the feedback from Richard's last review here:
>
> https://gcc.gnu.org/pipermail/gcc-patches/2023-December/640539.html
>
> In particular this version implements the suggested changes which
> greatly simplify the double list walk.
>
> Bootstrapped/regtested as a series on aarch64-linux-gnu, OK for trunk?
>
> Thanks,
> Alex
>
> -- >8 --
>
> This adds a new aarch64-specific RTL-SSA pass dedicated to forming load
> and store pairs (LDPs and STPs).
>
> As a motivating example for the kind of thing this improves, take the
> following testcase:
>
> extern double c[20];
>
> double f(double x)
> {
>   double y = x*x;
>   y += c[16];
>   y += c[17];
>   y += c[18];
>   y += c[19];
>   return y;
> }
>
> for which we currently generate (at -O2):
>
> f:
>         adrp    x0, c
>         add     x0, x0, :lo12:c
>         ldp     d31, d29, [x0, 128]
>         ldr     d30, [x0, 144]
>         fmadd   d0, d0, d0, d31
>         ldr     d31, [x0, 152]
>         fadd    d0, d0, d29
>         fadd    d0, d0, d30
>         fadd    d0, d0, d31
>         ret
>
> but with the pass, we generate:
>
> f:
> .LFB0:
>         adrp    x0, c
>         add     x0, x0, :lo12:c
>         ldp     d31, d29, [x0, 128]
>         fmadd   d0, d0, d0, d31
>         ldp     d30, d31, [x0, 144]
>         fadd    d0, d0, d29
>         fadd    d0, d0, d30
>         fadd    d0, d0, d31
>         ret
>
> The pass is local (only considers a BB at a time).  In theory, it should
> be possible to extend it to run over EBBs, at least in the case of pure
> (MEM_READONLY_P) loads, but this is left for future work.
>
> The pass works by identifying two kinds of bases: tree decls obtained
> via MEM_EXPR, and RTL register bases in the form of RTL-SSA def_infos.
> If a candidate memory access has a MEM_EXPR base, then we track it via
> this base, and otherwise if it is of a simple reg +  form, we track
> it via the RTL-SSA def_info for the register.
>
> For each BB, for a given kind of base, we build up a hash table mapping
> the base to an access_group.  The access_group data structure holds a
> list of accesses at each offset relative to the same base.  It uses a
> splay tree to support efficient insertion (while walking the bb), and
> the nodes are chained using a linked list to support efficient
> iteration (while doing the transformation).
>
> For each base, we then iterate over the access_group to identify
> adjacent accesses, and try to form load/store pairs for those insns that
> access adjacent memory.
>
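
As a rough illustration of the pairing step described in the quoted text, here is a toy C sketch. It is not the pass's code: struct access and count_pairs are hypothetical, and a sorted array stands in for the splay tree and linked list. Accesses off one base are ordered by offset, and neighbours that touch adjacent memory are paired greedily.

```c
#include <stdlib.h>

/* Toy model: hypothetical type, not the pass's access_group.  */
struct access {
    long offset;    /* byte offset from the shared base */
    long size;      /* access size in bytes */
};

/* qsort comparator ordering accesses by ascending offset.  */
static int by_offset(const void *a, const void *b)
{
    const struct access *x = a, *y = b;
    return (x->offset > y->offset) - (x->offset < y->offset);
}

/* Sort the accesses by offset, then greedily count same-size neighbours
   that touch adjacent memory.  */
int count_pairs(struct access *acc, size_t n)
{
    int pairs = 0;
    qsort(acc, n, sizeof *acc, by_offset);
    for (size_t i = 0; i + 1 < n; ) {
        if (acc[i].size == acc[i + 1].size
            && acc[i].offset + acc[i].size == acc[i + 1].offset) {
            pairs++;
            i += 2;     /* both accesses are consumed by the pair */
        } else {
            i++;
        }
    }
    return pairs;
}
```

For the four 8-byte loads at offsets 128, 136, 144 and 152 in the motivating example, this yields two pairs, matching the two ldp instructions in the second listing.
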
> The pass is currently run twice, both before and after register
> allocation.  The first copy of the pass is run late in the pre-RA RTL
> pipeline, immediately after sched1, since it was found that sched1 was
> increasing register pressure when the pass was run before.  The second
> copy of the pass runs immediately before peephole2, so as to get any
> opportunities that the existing ldp/stp peepholes can handle.
>
> There are some cases that we punt on before RA, e.g.
> accesses relative to eliminable regs (such as the soft frame pointer).
> We do this since we can't know the elimination offset before RA, and we
> want to avoid the RA reloading the offset (due to being out of ldp/stp
> immediate range) as this can generate worse code.
>
> The post-RA copy of the pass is there to pick up the crumbs that were
> left behind / things we punted on in the pre-RA pass.  Among other
> things, it's needed to handle accesses relative to the stack pointer.
> It can also handle code that didn't exist at the time the pre-RA pass
> was run (spill code, prologue/epilogue code).
>
> This is an initial implementation, and there are (among other possible
> improvements) the following notable caveats / missing features that are
> left for future work, but could give further improvements:
>
>  - Moving accesses between BBs within an EBB, see above.
>  - Out-of-range opportunities: currently the pass refuses to form pairs
>    if there isn't a suitable base register with an immediate in range
>    for ldp/stp, but it can be profitable to emit anchor addresses in the
>    case that there are four or more out-of-range nearby accesses that can
>    be formed into pairs.  This is handled by the current ldp/stp
>    peepholes, so it would be good to support this in the future.
>  - Discovery: currently we prioritize MEM_EXPR bases over RTL bases, which can
>    lead to us missing opportunities in the case that two accesses have distinct
>    MEM_EXPR bases (i.e. different DECLs) but they are still adjacent in memory
>    (e.g. adjacent variables on the stack).  I hope to address this for GCC 15,
>    hopefully getting to the point where we can remove the ldp/stp peepholes and
>    scheduling hooks.  Furthermore it would be nice to make the pass aware of
>    section anchors (adding these as a third kind of base) allowing merging
>    accesses to adjacent variables within 

[Bug modula2/112946] Assignment of string to enumeration or set crashes

2023-12-15 Thread gaius at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112946

Gaius Mulley  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Gaius Mulley  ---
Closing now the patch has been applied.

[Bug modula2/112946] Assignment of string to enumeration or set crashes

2023-12-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112946

--- Comment #3 from GCC Commits  ---
The master branch has been updated by Gaius Mulley :

https://gcc.gnu.org/g:7d7a480eedf0a195318d0fce2c9c57acae43ec9d

commit r14-6596-g7d7a480eedf0a195318d0fce2c9c57acae43ec9d
Author: Gaius Mulley 
Date:   Fri Dec 15 15:26:48 2023 +

PR modula2/112946 ICE assignment of string to enumeration or set

This patch introduces type checking during FoldBecomes and also
adds set/string/enum checking to the type checker.  FoldBecomes
has been re-written, tidied up and re-factored.

gcc/m2/ChangeLog:

PR modula2/112946
* gm2-compiler/M2Check.mod (checkConstMeta): New procedure
function.
(checkConstEquivalence): New procedure function.
(doCheckPair): Add call to checkConstEquivalence.
* gm2-compiler/M2GenGCC.mod (ResolveConstantExpressions): Call
FoldBecomes with reduced parameters.
(FoldBecomes): Re-write.
(TryDeclareConst): New procedure.
(RemoveQuads): New procedure.
(DeclaredOperandsBecomes): New procedure function.
(TypeCheckBecomes): New procedure function.
(PerformFoldBecomes): New procedure.
* gm2-compiler/M2Range.mod (FoldAssignment): Call
AssignmentTypeCompatible to check des expr compatibility.
* gm2-compiler/M2SymInit.mod (CheckReadBeforeInitQuad): Remove
parameter lst.
(FilterCheckReadBeforeInitQuad): Remove parameter lst.
(CheckReadBeforeInitFirstBasicBlock): Remove parameter lst.
Call FilterCheckReadBeforeInitQuad without lst.

gcc/testsuite/ChangeLog:

PR modula2/112946
* gm2/iso/fail/badassignment.mod: New test.
* gm2/iso/fail/badexpression.mod: New test.
* gm2/iso/fail/badexpression2.mod: New test.

Signed-off-by: Gaius Mulley 

[PATCH v4 10/11] aarch64: Add new load/store pair fusion pass

2023-12-15 Thread Alex Coplan
This is a v6 of the aarch64 load/store pair fusion pass, which
addresses the feedback from Richard's last review here:

https://gcc.gnu.org/pipermail/gcc-patches/2023-December/640539.html

In particular this version implements the suggested changes which
greatly simplify the double list walk.

Bootstrapped/regtested as a series on aarch64-linux-gnu, OK for trunk?

Thanks,
Alex

-- >8 --

This adds a new aarch64-specific RTL-SSA pass dedicated to forming load
and store pairs (LDPs and STPs).

As a motivating example for the kind of thing this improves, take the
following testcase:

extern double c[20];

double f(double x)
{
  double y = x*x;
  y += c[16];
  y += c[17];
  y += c[18];
  y += c[19];
  return y;
}

for which we currently generate (at -O2):

f:
        adrp    x0, c
        add     x0, x0, :lo12:c
        ldp     d31, d29, [x0, 128]
        ldr     d30, [x0, 144]
        fmadd   d0, d0, d0, d31
        ldr     d31, [x0, 152]
        fadd    d0, d0, d29
        fadd    d0, d0, d30
        fadd    d0, d0, d31
        ret

but with the pass, we generate:

f:
.LFB0:
        adrp    x0, c
        add     x0, x0, :lo12:c
        ldp     d31, d29, [x0, 128]
        fmadd   d0, d0, d0, d31
        ldp     d30, d31, [x0, 144]
        fadd    d0, d0, d29
        fadd    d0, d0, d30
        fadd    d0, d0, d31
        ret

The pass is local (only considers a BB at a time).  In theory, it should
be possible to extend it to run over EBBs, at least in the case of pure
(MEM_READONLY_P) loads, but this is left for future work.

The pass works by identifying two kinds of bases: tree decls obtained
via MEM_EXPR, and RTL register bases in the form of RTL-SSA def_infos.
If a candidate memory access has a MEM_EXPR base, then we track it via
this base, and otherwise if it is of a simple reg + <imm> form, we track
it via the RTL-SSA def_info for the register.

For each BB, for a given kind of base, we build up a hash table mapping
the base to an access_group.  The access_group data structure holds a
list of accesses at each offset relative to the same base.  It uses a
splay tree to support efficient insertion (while walking the bb), and
the nodes are chained using a linked list to support efficient
iteration (while doing the transformation).

For each base, we then iterate over the access_group to identify
adjacent accesses, and try to form load/store pairs for those insns that
access adjacent memory.
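
To make the grouping step above concrete, here is a minimal standalone
sketch.  It is not the code from the patch: base_id, access, access_group,
find_pairs and demo are hypothetical names, std::map is swapped in for the
hash table and splay tree, and the greedy pairing ignores everything the
real pass has to check (alias analysis, register constraints, immediate
ranges).

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the pass's types.  The real pass keys on tree
// decls or RTL-SSA def_infos; a plain integer id is assumed here.
using base_id = int;

struct access
{
  int insn_uid;        // which insn performs the access
  std::int64_t offset; // byte offset from the base
  unsigned size;       // access size in bytes
};

struct access_group
{
  // Accesses at each offset relative to one base, ordered by offset.
  std::map<std::int64_t, std::vector<access>> by_offset;

  void track (const access &a) { by_offset[a.offset].push_back (a); }
};

using insn_pair = std::pair<int, int>;

// Walk one basic block's accesses and greedily pair those that touch
// adjacent memory off the same base.
std::vector<insn_pair>
find_pairs (const std::vector<std::pair<base_id, access>> &bb)
{
  std::map<base_id, access_group> groups;
  for (const auto &entry : bb)
    groups[entry.first].track (entry.second);

  std::vector<insn_pair> pairs;
  for (const auto &g : groups)
    {
      const auto &offsets = g.second.by_offset;
      auto it = offsets.begin ();
      while (it != offsets.end ())
        {
          auto next = std::next (it);
          if (next == offsets.end ())
            break;
          const access &lo = it->second.front ();
          const access &hi = next->second.front ();
          if (lo.size == hi.size
              && lo.offset + static_cast<std::int64_t> (lo.size) == hi.offset)
            {
              pairs.emplace_back (lo.insn_uid, hi.insn_uid);
              it = std::next (next); // both accesses are consumed
            }
          else
            it = next;
        }
    }
  return pairs;
}

// Usage: the four loads of c[16]..c[19] from the testcase above, all
// relative to one base, at byte offsets 128, 136, 144 and 152.
inline void
demo ()
{
  std::vector<std::pair<base_id, access>> bb;
  bb.push_back (std::make_pair (0, access{1, 128, 8}));
  bb.push_back (std::make_pair (0, access{2, 136, 8}));
  bb.push_back (std::make_pair (0, access{3, 144, 8}));
  bb.push_back (std::make_pair (0, access{4, 152, 8}));
  assert (find_pairs (bb).size () == 2);
}
```

On the offsets from the motivating example this yields the two pairs that
correspond to the two ldp instructions in the improved output.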

The pass is currently run twice, both before and after register
allocation.  The first copy of the pass is run late in the pre-RA RTL
pipeline, immediately after sched1, since it was found that sched1 was
increasing register pressure when the pass was run before.  The second
copy of the pass runs immediately before peephole2, so as to get any
opportunities that the existing ldp/stp peepholes can handle.

There are some cases that we punt on before RA, e.g.
accesses relative to eliminable regs (such as the soft frame pointer).
We do this since we can't know the elimination offset before RA, and we
want to avoid the RA reloading the offset (due to being out of ldp/stp
immediate range) as this can generate worse code.

The post-RA copy of the pass is there to pick up the crumbs that were
left behind / things we punted on in the pre-RA pass.  Among other
things, it's needed to handle accesses relative to the stack pointer.
It can also handle code that didn't exist at the time the pre-RA pass
was run (spill code, prologue/epilogue code).

This is an initial implementation, and there are (among other possible
improvements) the following notable caveats / missing features that are
left for future work, but could give further improvements:

 - Moving accesses between BBs within an EBB, see above.
 - Out-of-range opportunities: currently the pass refuses to form pairs
   if there isn't a suitable base register with an immediate in range
   for ldp/stp, but it can be profitable to emit anchor addresses in the
   case that there are four or more out-of-range nearby accesses that can
   be formed into pairs.  This is handled by the current ldp/stp
   peepholes, so it would be good to support this in the future.
 - Discovery: currently we prioritize MEM_EXPR bases over RTL bases, which can
   lead to us missing opportunities in the case that two accesses have distinct
   MEM_EXPR bases (i.e. different DECLs) but they are still adjacent in memory
   (e.g. adjacent variables on the stack).  I hope to address this for GCC 15,
   hopefully getting to the point where we can remove the ldp/stp peepholes and
   scheduling hooks.  Furthermore it would be nice to make the pass aware of
   section anchors (adding these as a third kind of base) allowing merging
   accesses to adjacent variables within the same section.

gcc/ChangeLog:

* config.gcc: Add aarch64-ldp-fusion.o to extra_objs for aarch64.
* config/aarch64/aarch64-passes.def: Add copies of pass_ldp_fusion
before and after RA.
* 

[Bug modula2/112946] Assignment of string to enumeration or set crashes

2023-12-15 Thread gaius at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112946

--- Comment #2 from Gaius Mulley  ---
Created attachment 56886
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56886&action=edit
Proposed fix

Here is the proposed fix.

[Bug c++/70435] section attribute of a function template is not honored.

2023-12-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70435

--- Comment #9 from GCC Commits  ---
The master branch has been updated by Patrick Palka :

https://gcc.gnu.org/g:ea7bebff7cc5a5eb780a6ca646cb77cad1b625d6

commit r14-6595-gea7bebff7cc5a5eb780a6ca646cb77cad1b625d6
Author: Patrick Palka 
Date:   Fri Dec 15 10:03:31 2023 -0500

c++: section attribute on templates [PR70435, PR88061]

The section attribute currently has no effect on templates because the
call to set_decl_section_name only happens at parse time (on the
dependent decl) and not also at instantiation time.  This patch fixes
this by propagating the section name from the template to the
instantiation.
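
As an illustration of the kind of code affected (a sketch, not taken from
the patch or its tests), the templates below carry a section attribute.
The names IN_SECTION, twice, counter, .demo_text and .demo_data are
hypothetical, and the attribute is only applied on ELF targets since
section naming rules differ elsewhere.

```cpp
#include <cassert>

// Only ELF targets are assumed to accept these section names.
#if defined(__ELF__)
#  define IN_SECTION(name) [[gnu::section (name)]]
#else
#  define IN_SECTION(name)
#endif

// Function template: with the fix, instantiations such as twice<int>
// should land in .demo_text rather than the default text section.
template<class T>
IN_SECTION (".demo_text") T twice (T x) { return x + x; }

// Variable template: instantiations should land in .demo_data.
template<class T>
IN_SECTION (".demo_data") T counter = T ();

inline void
demo_section ()
{
  counter<int> = twice (21);
  assert (counter<int> == 42);
}
```

Whether the instantiations really end up in the named sections can be
checked by inspecting the assembly or object file; the program behaves
the same either way.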

PR c++/70435
PR c++/88061

gcc/cp/ChangeLog:

* pt.cc (tsubst_function_decl): Propagate DECL_SECTION_NAME
via set_decl_section_name.
(tsubst_decl) : Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/ext/attr-section1.C: New test.
* g++.dg/ext/attr-section1a.C: New test.
* g++.dg/ext/attr-section2.C: New test.
* g++.dg/ext/attr-section2a.C: New test.
* g++.dg/ext/attr-section2b.C: New test.

[Bug c++/88061] section attributes of variable templates are ignored

2023-12-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88061

--- Comment #9 from GCC Commits  ---
The master branch has been updated by Patrick Palka :

https://gcc.gnu.org/g:ea7bebff7cc5a5eb780a6ca646cb77cad1b625d6

commit r14-6595-gea7bebff7cc5a5eb780a6ca646cb77cad1b625d6
Author: Patrick Palka 
Date:   Fri Dec 15 10:03:31 2023 -0500

c++: section attribute on templates [PR70435, PR88061]

The section attribute currently has no effect on templates because the
call to set_decl_section_name only happens at parse time (on the
dependent decl) and not also at instantiation time.  This patch fixes
this by propagating the section name from the template to the
instantiation.

PR c++/70435
PR c++/88061

gcc/cp/ChangeLog:

* pt.cc (tsubst_function_decl): Propagate DECL_SECTION_NAME
via set_decl_section_name.
(tsubst_decl) : Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/ext/attr-section1.C: New test.
* g++.dg/ext/attr-section1a.C: New test.
* g++.dg/ext/attr-section2.C: New test.
* g++.dg/ext/attr-section2a.C: New test.
* g++.dg/ext/attr-section2b.C: New test.

[Bug c++/109715] abi_tag attribute is not applied to variable templates

2023-12-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109715

--- Comment #1 from GCC Commits  ---
The master branch has been updated by Patrick Palka :

https://gcc.gnu.org/g:46984fa259436ad50afb50a46a0a16e83bdf7172

commit r14-6594-g46984fa259436ad50afb50a46a0a16e83bdf7172
Author: Patrick Palka 
Date:   Fri Dec 15 10:03:26 2023 -0500

c++: abi_tag attribute on templates [PR109715]

We need to look through TEMPLATE_DECL when looking up the abi_tag
attribute (as with other function/variable declaration attributes).
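
A small sketch of declarations this applies to (tagged_var and tagged_fun
are hypothetical names, not the ones used in the new tests): the attribute
is written on the template, so a lookup that inspects only the
TEMPLATE_DECL wrapper rather than the declaration underneath would miss it.

```cpp
#include <cassert>

// The abi_tag attribute here appertains to the templated variable and
// function; the tag is expected to show up in the mangled names of the
// instantiations.
template<int N>
[[gnu::abi_tag ("foo")]] int tagged_var = N;

template<int N>
[[gnu::abi_tag ("foo")]] int tagged_fun () { return N; }

inline void
demo_abi_tag ()
{
  assert (tagged_var<3> == 3);
  assert (tagged_fun<4> () == 4);
}
```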

PR c++/109715

gcc/cp/ChangeLog:

* mangle.cc (get_abi_tags): Strip TEMPLATE_DECL before looking
up the abi_tag attribute.

gcc/testsuite/ChangeLog:

* g++.dg/abi/abi-tag25.C: New test.
* g++.dg/abi/abi-tag25a.C: New test.

Re: [committed] libstdc++: Implement C++23 <print> header [PR107760]

2023-12-15 Thread Tim Song
On Fri, Dec 15, 2023 at 4:43 AM Jonathan Wakely  wrote:

> On Fri, 15 Dec 2023 at 01:17, Tim Song wrote:
> >
> > On Thu, Dec 14, 2023 at 6:05 PM Jonathan Wakely 
> wrote:
> >> +  inline void
> >> +  vprint_unicode(ostream& __os, string_view __fmt, format_args __args)
> >> +  {
> >> +ostream::sentry __cerb(__os);
> >> +if (__cerb)
> >> +  {
> >> +
> >> +   const streamsize __w = __os.width();
> >> +   const bool __left
> >> + = (__os.flags() & ios_base::adjustfield) == ios_base::left;
> >
> >
> > I'm pretty sure - when I wrote this wording anyway - that the intent was
> that it was just an unformatted write at the end. The wording in
> [ostream.formatted.print] doesn't use the "determines padding" words of
> power that would invoke [ostream.formatted.reqmts]/3.
>
> Ah, OK. I misunderstood "formatted output function" as implying
> padding, failing to notice that we need those words of power to be
> present. My thinking was that if the stream has padding set in its
> format flags, it could be surprising if they're ignored by a formatted
> output function. And padding in the format string applies to
> individual replacement fields, not the whole string, and it's hard to
> use the stream's fill character and alignment.
>

But we would get none of the Unicode-aware padding logic we
do in format, which puts it in a very weird place.

And for cases where Unicode is not a problem, it's easy to get padding
with just os << std::format(...);


> You can do this to use the ostream's width:
>
> std::print("{0:{1}}", std::format(...), os.width());
>
> But to reuse its fill char and adjustfield you need to do something
> awful like I did in the committed code:
>
> std::string_view align;
> if ((os.flags() & std::ios::adjustfield) == std::ios::right)
>   align = ">";
> auto fs = std::format("{{:{}{}{}}}", os.fill(), align, os.width());
> std::vprint_nonunicode(os, fs, std::make_format_args(std::format(...)));


> And now you have to hardcode a choice between vprint_unicode and
> vprint_nonunicode, instead of letting std::print decide it. Let's hope
> nobody ever needs to do any of that ;-)
>

At least the upcoming runtime_format alleviates that :)


>
> I'll remove the code for padding the padding, thanks for checking the
> patch.
>
>


Re: [pushed] testsuite: move more analyzer test cases to c-c++-common (3) [PR96395]

2023-12-15 Thread Rainer Orth
David Malcolm  writes:

> Move a further 268 tests from gcc.dg/analyzer to c-c++-common/analyzer.
>
> Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
> Pushed to trunk as r14-6564-gae034b9106fbdd.

This patch introduced 840 additional FAILs on i386-pc-solaris2.11, no
doubt more instances of PR analyzer/111475.  Is this supposed to work
anywhere but Linux?  Right now the analyzer testsuite is a total
nightmare...

Rainer

-- 
Rainer Orth, Center for Biotechnology, Bielefeld University


[Bug c/44179] warn about sizeof(char) and sizeof('x')

2023-12-15 Thread zack+srcbugz at owlfolio dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44179

--- Comment #3 from Zack Weinberg  ---
It's come to my attention that people fairly often write `sizeof('x')` (or some
other character constant), _expecting_ it to evaluate as `sizeof(char)`.  But
this is only true in C++. In C, `sizeof('x')` evaluates as `sizeof(int)`.

See
http://codesearch.debian.net/search?q=filetype%3Ac+%5Cbsizeof%5Cs*%5C%28%5Cs*%27=0
for many examples.

It's probably more important to warn about this than about `sizeof(char)`
itself.
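
For reference, a minimal sketch of the C++ side of this.  The helper name
char_buffer_bytes is hypothetical.  The first static_assert holds in C++
only; compiled as C, sizeof('x') would equal sizeof(int) instead.

```cpp
#include <cassert>
#include <cstddef>

// In C++ a character literal has type char, so both sizes are 1.
static_assert (sizeof ('x') == sizeof (char), "C++: 'x' has type char");
static_assert (sizeof (char) == 1, "sizeof (char) is 1 by definition");

// Hypothetical helper: bytes needed for a buffer of n characters.
// Writing n * sizeof ('x') here would over-allocate in C, where 'x'
// has type int.
inline std::size_t
char_buffer_bytes (std::size_t n)
{
  return n * sizeof (char);
}

inline void
demo_sizeof ()
{
  assert (char_buffer_bytes (16) == 16);
}
```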

[Bug sanitizer/112727] [11/12/13 Regression] UBSAN creates GIMPLE path with uninitialized variable

2023-12-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112727

--- Comment #10 from GCC Commits  ---
The releases/gcc-13 branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:a982b9cb690a163434f1ac5a0901548b594205f2

commit r13-8160-ga982b9cb690a163434f1ac5a0901548b594205f2
Author: Jakub Jelinek 
Date:   Fri Dec 8 20:56:48 2023 +0100

c++: Unshare folded SAVE_EXPR arguments during cp_fold [PR112727]

The following testcase is miscompiled because two ubsan instrumentations
run into each other.
The first one is the shift instrumentation.  Before the C++ FE calls
it, it wraps the 2 shift arguments with cp_save_expr, so that side-effects
in them aren't evaluated multiple times.  And, ubsan_instrument_shift
itself uses unshare_expr on any uses of the operands to make sure further
modifications in them don't affect other copies of them (the only not
unshared ones are the one the caller then uses for the actual operation
after the instrumentation, which means there is no tree sharing).

Now, if there are side-effects in the first operand like say function
call, cp_save_expr wraps it into a SAVE_EXPR, and ubsan_instrument_shift
in this mode emits something like
if (..., SAVE_EXPR , SAVE_EXPR  > const)
 __ubsan_handle_shift_out_of_bounds (..., SAVE_EXPR , ...);
and caller adds
SAVE_EXPR  << SAVE_EXPR 
after it in a COMPOUND_EXPR.  So far so good.

If there are no side-effects and cp_save_expr doesn't create SAVE_EXPR,
everything is ok as well because of the unshare_expr.
We have
if (..., SAVE_EXPR  > const)
 __ubsan_handle_shift_out_of_bounds (..., ptr->something[i], ...);
and
ptr->something[i] << SAVE_EXPR 
where ptr->something[i] is unshared.

In the testcase below, the !x->s[j] ? 1 : 0 expression is wrapped initially
into a SAVE_EXPR though, and unshare_expr doesn't unshare SAVE_EXPRs nor
anything used in them for obvious reasons, so we end up with:
if (..., SAVE_EXPR (x)->s[j] ?
1 : 0>, SAVE_EXPR  > const)
 __ubsan_handle_shift_out_of_bounds (..., SAVE_EXPR (x)->s[j] ? 1 : 0>, ...);
and
SAVE_EXPR (x)->s[j] ? 1 : 0> <<
SAVE_EXPR 
So far good as well.  But later during cp_fold of the SAVE_EXPR we find
out that VIEW_CONVERT_EXPR(x)->s[j] ? 0 : 1 is actually
invariant (has TREE_READONLY set) and so cp_fold simplifies the above to
if (..., SAVE_EXPR  > const)
 __ubsan_handle_shift_out_of_bounds (..., (bool) VIEW_CONVERT_EXPR(x)->s[j] ? 0 : 1, ...);
and
((bool) VIEW_CONVERT_EXPR(x)->s[j] ? 0 : 1) << SAVE_EXPR

with the s[j] ARRAY_REFs and other expressions shared in between the two
uses (and obviously the expression optimized away from the COMPOUND_EXPR in
the if condition).

Then comes another ubsan instrumentation at genericization time,
this time to instrument the ARRAY_REFs with strict bounds checking,
and replaces the s[j] in there with s[.UBSAN_BOUNDS (0B, SAVE_EXPR, 8),
SAVE_EXPR]
As the trees are shared, it does that just once though.
And as the if body is gimplified first, the SAVE_EXPR is evaluated inside
of the if body and when it is used again after the if, it uses a potentially
uninitialized value of j.1 (always uninitialized if the shift count isn't
out of bounds).

The following patch fixes that by unshare_expr unsharing the folded argument
of a SAVE_EXPR if we've folded the SAVE_EXPR into an invariant and it is
used more than once.

2023-12-08  Jakub Jelinek  

PR sanitizer/112727
* cp-gimplify.cc (cp_fold): If SAVE_EXPR has been previously
folded, unshare_expr what is returned.

* c-c++-common/ubsan/pr112727.c: New test.

(cherry picked from commit 6ddaf06e375e1c15dcda338697ab6ea457e6f497)

[Bug middle-end/112733] [14 Regression] ICE: Segmentation fault in wide-int.cc during GIMPLE pass: sccp

2023-12-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112733

--- Comment #16 from GCC Commits  ---
The releases/gcc-13 branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:8c0ea9320ce4d2381ebf043cd20a0afce88da880

commit r13-8159-g8c0ea9320ce4d2381ebf043cd20a0afce88da880
Author: Jakub Jelinek 
Date:   Wed Nov 29 12:26:50 2023 +0100

fold-const: Fix up multiple_of_p [PR112733]

We ICE on the following testcase when wi::multiple_of_p is called on
widest_int 1 and -128 with UNSIGNED.  I still need to work on the
actual wide-int.cc issue, the latest patch attached to the PR regressed
bitint-{38,39}.c, so will need to debug that, but there is a clear bug
on the fold-const.cc side as well - widest_int is a signed representation
by definition, using UNSIGNED with it certainly doesn't match what was
intended, because -128 as the second operand effectively means unsigned
131072 bit 0xfff...fff80 integer, not the signed char -128
that appeared in the source.

In the INTEGER_CST case a few lines above this we already use
case INTEGER_CST:
  if (TREE_CODE (bottom) != INTEGER_CST || integer_zerop (bottom))
    return false;
  return wi::multiple_of_p (wi::to_widest (top), wi::to_widest (bottom),
                            SIGNED);
so I think using SIGNED with widest_int is best there (compared to the
other choices in the PR).
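
The difference in meaning can be sketched with fixed-width integers.  This
is not GCC's wi:: API: multiple_of_signed and multiple_of_unsigned are
hypothetical helpers, and 64 bits stand in for the much wider widest_int.
The same bit pattern for -128 gives different answers depending on the
signedness used to interpret it.

```cpp
#include <cassert>
#include <cstdint>

// Interpret both operands as signed: -128 really is -128.
inline bool
multiple_of_signed (std::int64_t top, std::int64_t bottom)
{
  return bottom != 0 && top % bottom == 0;
}

// Reinterpret the same bits as unsigned: -128 becomes 0xffffffffffffff80,
// a huge value that no small positive number is a multiple of.
inline bool
multiple_of_unsigned (std::int64_t top, std::int64_t bottom)
{
  std::uint64_t t = static_cast<std::uint64_t> (top);
  std::uint64_t b = static_cast<std::uint64_t> (bottom);
  return b != 0 && t % b == 0;
}

inline void
demo_multiple_of ()
{
  assert (multiple_of_signed (256, -128));    // 256 == -2 * -128
  assert (!multiple_of_unsigned (256, -128)); // 256 < 0xffffffffffffff80
}
```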

2023-11-29  Jakub Jelinek  

PR middle-end/112733
* fold-const.cc (multiple_of_p): Pass SIGNED rather than
UNSIGNED for wi::multiple_of_p on widest_int arguments.

* gcc.dg/pr112733.c: New test.

(cherry picked from commit 5c95bf945c632925efba86dd5dceccdb9da8884c)
