[Bug target/115519] s390 fallout from removing vcond{,u,eq} patterns

2024-06-17 Thread stefansf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115519

--- Comment #1 from Stefan Schulze Frielinghaus ---
For example, for function vesrlf_ge from vcond-shift.c we do not end up with

vl  %v2,0(%r2),3
vl  %v0,16(%r2),3
lgr %r1,%r2
vesrlf  %v4,%v2,31
vesrlf  %v6,%v0,31
vst %v4,0(%r1),3
vst %v6,16(%r1),3
br  %r14

anymore but

vl  %v0,0(%r2),3
vl  %v4,16(%r2),3
vgmf    %v6,31,31
vzero   %v2
vesraf  %v1,%v0,31
vesraf  %v3,%v4,31
vsel    %v5,%v6,%v2,%v1
vsel    %v7,%v6,%v2,%v3
lgr %r1,%r2
vst %v5,0(%r1),3
vst %v7,16(%r1),3
br  %r14

During vcond expansion we used to optimize x < 0 ? 1 : 0 into x >> 31 (a
logical shift), which we now fail to do.  Doing it late in combine fails, too,
since we never come up with a combination of insns 7, 8, 9, and 10:

(insn 7 6 8 2 (set (reg:V4SI 69 [ mask__5.8_4 ])
(ashiftrt:V4SI (reg:V4SI 68 [ MEM  [(int *)xx_10] ])
(const_int 31 [0x1f]))) "vcond-shift.c":155:28 905 {*ashrv4si3}
 (expr_list:REG_DEAD (reg:V4SI 68 [ MEM  [(int *)xx_10] ])
(nil)))
(insn 8 7 9 2 (set (reg:V4SI 70)
(const_vector:V4SI [
(const_int 1 [0x1]) repeated x4
])) 410 {movv4si}
 (nil))
(insn 9 8 10 2 (set (reg:V4SI 71)
(const_vector:V4SI [
(const_int 0 [0]) repeated x4
])) 410 {movv4si}
 (nil))
(insn 10 9 11 2 (set (reg:V4SI 62 [ vect_patt_18.9 ])
(if_then_else:V4SI (eq (reg:V4SI 69 [ mask__5.8_4 ])
(const_vector:V4SI [
(const_int 0 [0]) repeated x4
]))
(reg:V4SI 71)
(reg:V4SI 70))) 1265 {*vec_sel0v4si}
 (expr_list:REG_DEAD (reg:V4SI 69 [ mask__5.8_4 ])
(expr_list:REG_EQUAL (if_then_else:V4SI (eq (reg:V4SI 69 [ mask__5.8_4
])
(const_vector:V4SI [
(const_int 0 [0]) repeated x4
]))
(const_vector:V4SI [
(const_int 0 [0]) repeated x4
])
(const_vector:V4SI [
(const_int 1 [0x1]) repeated x4
]))
(nil

So maybe this is something for match.pd?
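
To make the lost fold concrete, here is a minimal scalar sketch in C of the two
forms for 32-bit lanes.  The function names are made up for illustration and
nothing here is taken from the s390 backend or from vcond-shift.c; it only
checks that selecting 1 or 0 on the sign mask gives the same lanes as a logical
shift right by 31.

```c
#include <assert.h>
#include <stdint.h>

/* Select form: build a sign mask (what an arithmetic shift right by 31
   yields), then pick 1 or 0 per lane depending on the mask.  */
void lanes_select(const int32_t *x, int32_t *out, int n)
{
  for (int i = 0; i < n; i++)
    {
      int32_t mask = x[i] < 0 ? -1 : 0;
      out[i] = mask != 0 ? 1 : 0;
    }
}

/* Shift form: a logical shift right by 31 leaves only the sign bit.  */
void lanes_shift(const int32_t *x, int32_t *out, int n)
{
  for (int i = 0; i < n; i++)
    out[i] = (int32_t) ((uint32_t) x[i] >> 31);
}

/* Returns 1 after asserting that both forms agree on a few sample lanes.  */
int lanes_forms_agree(void)
{
  const int32_t in[8] = { 0, 1, -1, 42, -42, INT32_MAX, INT32_MIN, 7 };
  int32_t a[8], b[8];

  lanes_select(in, a, 8);
  lanes_shift(in, b, 8);
  for (int i = 0; i < 8; i++)
    assert(a[i] == b[i]);
  return 1;
}
```

The cast to uint32_t is what makes the shift logical; shifting the signed value
would instead reproduce the mask.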

[Bug tree-optimization/115531] vectorizer generates inefficient code for masked conditional update loops

2024-06-17 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115531

--- Comment #4 from Richard Biener  ---
AVX512 produces

.L3:
vmovdqu8    (%rsi), %zmm9{%k1}
kshiftrq    $32, %k1, %k5
kshiftrq    $48, %k1, %k4
movl    %r9d, %eax
vmovdqu32   128(%rcx), %zmm7{%k5}
subl    %esi, %eax
movl    $64, %edi
vmovdqu32   128(%rdx), %zmm3{%k5}
kshiftrq    $16, %k1, %k6
addl    %r10d, %eax
vmovdqu32   192(%rcx), %zmm8{%k4}
cmpl    %edi, %eax
vmovdqu32   192(%rdx), %zmm4{%k4}
cmova   %edi, %eax
addq    $64, %rsi
addq    $256, %rcx
vmovdqu32   -256(%rcx), %zmm5{%k1}
vmovdqu32   (%rdx), %zmm1{%k1}
vmovdqu32   -192(%rcx), %zmm6{%k6}
vmovdqu32   64(%rdx), %zmm2{%k6}
vpcmpb  $4, %zmm14, %zmm9, %k2
kshiftrq    $32, %k2, %k3
vpblendmd   %zmm7, %zmm3, %zmm10{%k3}
kshiftrd    $16, %k3, %k3
vpblendmd   %zmm8, %zmm4, %zmm0{%k3}
vpblendmd   %zmm5, %zmm1, %zmm12{%k2}
vmovdqu32   %zmm10, 128(%rdx){%k5}
kshiftrd    $16, %k2, %k2
vmovdqu32   %zmm0, 192(%rdx){%k4}
vpblendmd   %zmm6, %zmm2, %zmm11{%k2}
vpbroadcastb    %eax, %zmm0
movl    %r9d, %eax
vmovdqu32   %zmm12, (%rdx){%k1}
subl    %esi, %eax
addl    %r8d, %eax
vmovdqu32   %zmm11, 64(%rdx){%k6}
addq    $256, %rdx
vpcmpub $6, %zmm13, %zmm0, %k1
cmpl    $64, %eax
ja  .L3


The vectorizer sees

   [local count: 955630225]:
  # i_26 = PHI 
  _1 = (long unsigned int) i_26;
  _2 = _1 * 4;
  _3 = c_17(D) + _2;
  res_18 = *_3;
  _4 = stride_14(D) + i_26;
  _5 = (long unsigned int) _4;
  _6 = _5 * 4;
  _7 = b_19(D) + _6;
  t_20 = *_7;
  _8 = a_21(D) + _1;
  _9 = *_8;
  _34 = _9 != 0;
  res_11 = _34 ? t_20 : res_18;
  *_3 = res_11;
  i_23 = i_26 + 1;
  if (n_16(D) > i_23)

I believe that to get proper vectorizer costing we want to have an
optimization phase that can take into account whether we use a masked
loop or not.  Note that your intended transform relies on identifying
the open-coded "conditional store"

  int res = c[i];
  if (a[i] != 0)
    res = t;
  c[i] = res;

As Andrew says, when that's a .MASK_STORE it's going to be easier to
identify the opportunity.  So yeah, if-conversion could recognize
this pattern and produce a .MASK_STORE from it as a first step.
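
For reference, a scalar sketch in C of the two shapes being discussed: the
open-coded conditional store and a store that only happens where the condition
holds, which is what a .MASK_STORE would express per lane.  The function names
and the small self-check are illustrative assumptions, not code from the PR or
from if-conversion.

```c
#include <assert.h>

/* Open-coded form: load c[i], select, store unconditionally.  */
void update_select(const char *a, const int *b, int *c, int n, int stride)
{
  for (int i = 0; i < n; i++)
    {
      int res = c[i];
      int t = b[i + stride];
      if (a[i] != 0)
        res = t;
      c[i] = res;
    }
}

/* Masked form: store t only in the lanes where the condition is true,
   leaving the other elements of c untouched.  */
void update_masked(const char *a, const int *b, int *c, int n, int stride)
{
  for (int i = 0; i < n; i++)
    {
      int t = b[i + stride];
      if (a[i] != 0)
        c[i] = t;
    }
}

/* Returns 1 after asserting that both forms leave c in the same state.  */
int update_forms_agree(void)
{
  const char a[6] = { 0, 1, 0, 2, 0, -1 };
  const int b[8] = { 10, 11, 12, 13, 14, 15, 16, 17 };
  int c1[6] = { 100, 101, 102, 103, 104, 105 };
  int c2[6] = { 100, 101, 102, 103, 104, 105 };

  update_select(a, b, c1, 6, 2);
  update_masked(a, b, c2, 6, 2);
  for (int i = 0; i < 6; i++)
    assert(c1[i] == c2[i]);
  assert(c1[0] == 100 && c1[1] == 13);
  return 1;
}
```

The masked form no longer needs the load of c[i] or the select, which is the
saving the report is after.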

[Bug middle-end/115530] ICE: in verify_loop_structure, at cfgloop.cc:1741 with simd attribute and tm

2024-06-17 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115530

Richard Biener  changed:

   What|Removed |Added

   Keywords||openmp
   Last reconfirmed||2024-06-18
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW

--- Comment #2 from Richard Biener  ---
Less invalid:

__attribute__((simd)) int a()
{
  int *b = __builtin_malloc(6 * sizeof (int));
  __transaction_atomic { b[5] = 3; }
  return b[5];
}

[Bug target/115324] [12/13 Regression] PCH of rs6000 builtins broken

2024-06-17 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115324

--- Comment #9 from GCC Commits  ---
The master branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:e17114f99c9ea754787573679b3b4d2b52434b61

commit r15-1393-ge17114f99c9ea754787573679b3b4d2b52434b61
Author: Jakub Jelinek 
Date:   Tue Jun 18 08:32:37 2024 +0200

rs6000: Shrink rs6000_init_generated_builtins size [PR115324]

While my r15-1001-g4cf2de9b5268224 PCH PIE power fix change decreased the
.data section sizes (219792 -> 189336), it increased the size of already
huge rs6000_init_generated_builtins generated function, from 218328
to 228668 bytes.  That is because there are thousands of array references
to global arrays and we keep constructing the addresses of the arrays
again and again.

Ideally some optimization would figure out we have a single function which
has
    461   rs6000_overload_info
   1257   rs6000_builtin_info_fntype
   1768   rs6000_builtin_decls
   2548   rs6000_instance_info_fntype
array references and that maybe it might be a good idea to just preload
the addresses of those arrays into some register if it decreases code size
and doesn't slow things down.
The function actually is called just once and is huge, so code size is even
more important than speed, which is dominated by all the GC allocations
anyway.

Until that is done, here is a slightly cleaner version of the hack, which
makes the function noipa (so that LTO doesn't undo it) for GCC 8.1+ and
passes the 4 arrays as arguments to the function from the caller.
This decreases the function size from 228668 bytes to 207572 bytes.
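
A rough C sketch of the idea, assuming nothing about the generated code beyond
what the message says: the table and function names below are hypothetical
stand-ins, and the version check around the attribute mirrors the "GCC 8.1+"
condition mentioned above.

```c
#include <assert.h>
#include <string.h>

#define DEMO_N 4

/* Hypothetical stand-ins for the real global tables.  */
int demo_builtin_decls[DEMO_N];
int demo_overload_info[DEMO_N];

/* Before: every statement names a global array directly, so the compiler
   may rebuild the array's address at each reference.  */
void demo_init_direct(void)
{
  demo_builtin_decls[0] = 10;
  demo_builtin_decls[1] = 11;
  demo_overload_info[0] = 20;
  demo_overload_info[1] = 21;
}

/* noipa exists since GCC 8; fall back to nothing elsewhere.  */
#if defined(__GNUC__) && !defined(__clang__) \
    && (__GNUC__ > 8 || (__GNUC__ == 8 && __GNUC_MINOR__ >= 1))
#define DEMO_NOIPA __attribute__((__noipa__))
#else
#define DEMO_NOIPA
#endif

/* After: the caller passes the base addresses once, and noipa keeps
   interprocedural passes from substituting the globals back in.  */
DEMO_NOIPA void demo_init_via_args(int *builtin_decls, int *overload_info)
{
  builtin_decls[0] = 10;
  builtin_decls[1] = 11;
  overload_info[0] = 20;
  overload_info[1] = 21;
}

/* Returns 1 after asserting that both variants fill the tables alike.  */
int demo_same_tables(void)
{
  int decls[DEMO_N] = { 0 }, ovld[DEMO_N] = { 0 };

  memset(demo_builtin_decls, 0, sizeof demo_builtin_decls);
  memset(demo_overload_info, 0, sizeof demo_overload_info);
  demo_init_direct();
  demo_init_via_args(decls, ovld);
  assert(memcmp(decls, demo_builtin_decls, sizeof decls) == 0);
  assert(memcmp(ovld, demo_overload_info, sizeof ovld) == 0);
  return 1;
}
```

Whether this shrinks code on a given target depends on how expensive it is to
form a global's address there; the sketch only shows the shape of the change.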

2024-06-18  Jakub Jelinek  

PR target/115324
* config/rs6000/rs6000-gen-builtins.cc (write_decls): Change
declaration of rs6000_init_generated_builtins from no arguments
to 4 pointer arguments.
(write_init_bif_table): Change rs6000_builtin_info_fntype to
builtin_info_fntype and rs6000_builtin_decls to builtin_decls.
(write_init_ovld_table): Change rs6000_instance_info_fntype to
instance_info_fntype, rs6000_builtin_decls to builtin_decls and
rs6000_overload_info to overload_info.
(write_init_file): Add __noipa__ attribute to
rs6000_init_generated_builtins for GCC 8.1+ and change the function
from no arguments to 4 pointer arguments.  Change rs6000_builtin_decls
to builtin_decls.
* config/rs6000/rs6000-builtin.cc (rs6000_init_builtins): Adjust
rs6000_init_generated_builtins caller.

[Bug middle-end/115528] [15 regression] segmentation fault in legacy F77 code

2024-06-17 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115528

Richard Biener  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |WAITING
   Keywords||needs-bisection
   Last reconfirmed||2024-06-18

--- Comment #3 from Richard Biener  ---
We still need a testcase of some sort, or possibly bisection.

[Bug tree-optimization/101639] vectorization with bool reduction

2024-06-17 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101639

Richard Biener  changed:

   What|Removed |Added

 CC||max.sagebaum at scicomp dot uni-kl.de

--- Comment #3 from Richard Biener  ---
*** Bug 115520 has been marked as a duplicate of this bug. ***

[Bug tree-optimization/53947] [meta-bug] vectorizer missed-optimizations

2024-06-17 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
Bug 53947 depends on bug 115520, which changed state.

Bug 115520 Summary: Loop vectorization fails when bool variable instead of unsigned char
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115520

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |DUPLICATE

[Bug tree-optimization/115520] Loop vectorization fails when bool variable instead of unsigned char

2024-06-17 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115520

Richard Biener  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
 Status|NEW |RESOLVED

--- Comment #5 from Richard Biener  ---
It's a dup.

*** This bug has been marked as a duplicate of bug 101639 ***

[Bug tree-optimization/101639] vectorization with bool reduction

2024-06-17 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101639

Richard Biener  changed:

   What|Removed |Added

   Last reconfirmed|2021-07-27 00:00:00 |2024-6-18
 CC||rguenth at gcc dot gnu.org,
   ||rsandifo at gcc dot gnu.org

--- Comment #2 from Richard Biener  ---
Re-confirmed.  To recap:

t.c:4:21: note:   === vect_determine_precisions ===
t.c:4:21: note:   using normal nonmask vectors for r_14 = PHI <_8(6), 1(5)>
t.c:4:21: note:   using boolean precision 8 for _4 = _3 != 0;
t.c:4:21: note:   using boolean precision 8 for _8 = _4 & r_14;
...
t.c:4:21: note:   ==> examining phi: r_14 = PHI <_8(6), 1(5)>
t.c:4:21: note:   get vectype for scalar type:  bool
t.c:4:21: note:   vectype: vector(16) unsigned char
..
t.c:4:21: note:   ==> examining statement: _4 = _3 != 0;
t.c:4:21: note:   vectype: vector(16) 
t.c:4:21: note:   nunits = 16
t.c:4:21: note:   ==> examining statement: _8 = _4 & r_14;
t.c:4:21: note:   vectype: vector(16) 
t.c:4:21: note:   nunits = 16

that doesn't match up.  The solution might be to realize _8 is live and
thus needed as nonmask.
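
For readers without the testcase at hand, a loop of roughly the following shape
would produce statements like those in the dump (r = (x != 0) & r carried
around the loop).  This is a reconstruction from the dump and from the summary
of the duplicate, not the PR's actual testcase; the function names are
assumptions.

```c
#include <assert.h>
#include <stdbool.h>

/* Reduction carried in a bool, the shape the dump suggests.  */
bool all_nonzero_bool(const unsigned char *a, int n)
{
  bool r = true;
  for (int i = 0; i < n; i++)
    r = (a[i] != 0) & r;
  return r;
}

/* The same reduction carried in an unsigned char, the variant the
   duplicate's summary contrasts it with.  */
bool all_nonzero_uchar(const unsigned char *a, int n)
{
  unsigned char r = 1;
  for (int i = 0; i < n; i++)
    r &= (unsigned char) (a[i] != 0);
  return r != 0;
}

/* Returns 1 after asserting that both variants compute the same result.  */
int reductions_agree(void)
{
  const unsigned char yes[4] = { 1, 2, 3, 4 };
  const unsigned char no[4] = { 1, 0, 3, 4 };

  assert(all_nonzero_bool(yes, 4) && all_nonzero_uchar(yes, 4));
  assert(!all_nonzero_bool(no, 4) && !all_nonzero_uchar(no, 4));
  return 1;
}
```

Both functions compute the same value; the difference is only in the type of
the reduction variable that the vectorizer has to assign a vector type to.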

[Bug middle-end/87403] [Meta-bug] Issues that suggest a new warning

2024-06-17 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87403
Bug 87403 depends on bug 84203, which changed state.

Bug 84203 Summary: add -Wsuggest-attribute=returns_nonnull
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84203

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

[Bug tree-optimization/84203] add -Wsuggest-attribute=returns_nonnull

2024-06-17 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84203

Sam James  changed:

   What|Removed |Added

 Resolution|--- |FIXED
   Target Milestone|--- |14.0
 Status|NEW |RESOLVED
 CC||hubicka at gcc dot gnu.org,
   ||sjames at gcc dot gnu.org

--- Comment #5 from Sam James  ---
.

[Bug tree-optimization/58689] [meta-bug] __attribute__((returns_nonnull)) enhancements

2024-06-17 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58689
Bug 58689 depends on bug 84203, which changed state.

Bug 84203 Summary: add -Wsuggest-attribute=returns_nonnull
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84203

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

[Bug target/115517] Fix x86 regressions after dropping uses of vcond{,u,eq}_optab

2024-06-17 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115517

--- Comment #1 from Richard Biener  ---
Btw, I had opened PR115490 with my results for this already.  Some mitigation
should come from optimizing ISEL expansion to vcond_mask, and I'd start by
looking at some of the fallout from that side (note that this might require
the backend to reject vec_cmp forms it does not implement natively via its
operand 1 predicate).

[Bug c/115516] constexpr use before C23 could give a better error message

2024-06-17 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115516

--- Comment #3 from Jakub Jelinek  ---
Sorry, this was about C, not C++.  And apparently C++ manages to warn about
that in cp_parser_diagnose_invalid_type_name through the
if (cxx_dialect < cxx11 && id == ridpointers[(int)RID_CONSTEXPR])
check.

[Bug c/115516] constexpr use before C23 could give a better error message

2024-06-17 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115516

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek  ---
It will be hard, because constexpr before C++11 is not a keyword.
And it can't be: int constexpr = 1; is completely valid C++98.

[Bug c/115532] New: Small documentation fixes for -Wsuggest-attribute=returns_nonnull

2024-06-17 Thread peter at eisentraut dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115532

Bug ID: 115532
   Summary: Small documentation fixes for
-Wsuggest-attribute=returns_nonnull
   Product: gcc
   Version: 14.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: peter at eisentraut dot org
  Target Milestone: ---

Created attachment 58457
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58457&action=edit
patch with fixes

The documentation for -Wsuggest-attribute=returns_nonnull (new in gcc 14,
commit 53ba8d66955) contains a few small mistakes.

See
https://gcc.gnu.org/onlinedocs/gcc-14.1.0/gcc/Warning-Options.html#index-Wsuggest-attribute_003d
for the current rendering.

1. Wrong punctuation:
"-Wsuggest-attribute=[pure|const|noreturn|format|cold|malloc]returns_nonnull|"
(note "]" after "malloc").

2. Not added to the overview earlier in the page (also "cold" was missing
there).

3. Not added to the index. (Instead, the "no" form was added as an @itemx,
which is inconsistent with the other entries.)

I'm attaching a patch with these three things fixed.  (Note, I have not built
the documentation with this.)

[Bug c/115516] constexpr use before C23 could give a better error message

2024-06-17 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115516

Richard Biener  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2024-06-18
 Status|UNCONFIRMED |NEW

--- Comment #1 from Richard Biener  ---
Confirmed.

[Bug tree-optimization/111793] OpenMP SIMD inbranch clones for AVX512 are highly sub-optimal

2024-06-17 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111793

--- Comment #8 from Richard Biener  ---
The committed fix should resolve the stray masked stores observed but not the
inefficiency dealing with incoming AVX512 mask arguments.

[Bug tree-optimization/84203] add -Wsuggest-attribute=returns_nonnull

2024-06-17 Thread peter at eisentraut dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84203

Peter Eisentraut  changed:

   What|Removed |Added

 CC||peter at eisentraut dot org

--- Comment #4 from Peter Eisentraut  ---
This is implemented in gcc 14:

commit 53ba8d66955
Author: Jan Hubicka 
Date:   Mon Nov 20 19:35:53 2023

inter-procedural value range propagation

implement very basic propapgation of return value ranges from VRP
pass.  This helps std::vector's push_back since we work out value range of
allocated block.  This propagates only within single translation unit.  I hoped
we will also do the propagation at WPA stage, but that needs more work on
ipa-cp side.

I also added code auto-detecting return_nonnull and corresponding
-Wsuggest-attribute.

[It is misspelled in the commit message.]

[Bug target/115493] [15 regression] gcc.c-torture/execute/pr94734.c fails on MVE after r15-1054-g202a9c8fe7d

2024-06-17 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115493

Richard Biener  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #9 from Richard Biener  ---
Should be fixed now.

[Bug target/115493] [15 regression] gcc.c-torture/execute/pr94734.c fails on MVE after r15-1054-g202a9c8fe7d

2024-06-17 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115493

--- Comment #8 from GCC Commits  ---
The master branch has been updated by Richard Biener:

https://gcc.gnu.org/g:19258ca1b85bc15e3a49054eff209f4f0d1c5bee

commit r15-1392-g19258ca1b85bc15e3a49054eff209f4f0d1c5bee
Author: Richard Biener 
Date:   Mon Jun 17 16:01:15 2024 +0200

tree-optimization/115493 - fix wrong code with SLP induction cond reduction

The following fixes a bad final value being used when doing single-lane
SLP integer induction cond reduction vectorization.

PR tree-optimization/115493
* tree-vect-loop.cc (vect_create_epilog_for_reduction): Use
the first scalar result.

[Bug tree-optimization/111793] OpenMP SIMD inbranch clones for AVX512 are highly sub-optimal

2024-06-17 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111793

--- Comment #7 from GCC Commits  ---
The master branch has been updated by Richard Biener:

https://gcc.gnu.org/g:4b75ed33fa5fd604897e7a30e79bd28d46598373

commit r15-1391-g4b75ed33fa5fd604897e7a30e79bd28d46598373
Author: Richard Biener 
Date:   Fri Jun 14 14:46:08 2024 +0200

Enhance if-conversion for automatic arrays

Automatic arrays that are not address-taken should not be subject to
store data races.  This applies to OMP SIMD in-branch lowered
functions result array which for the testcase otherwise prevents
vectorization with SSE and for AVX and AVX512 ends up with spurious
.MASK_STORE to the stack surviving.

This inefficiency was noted in PR111793.

I've introduced ref_can_have_store_data_races, commonizing uses
of flag_store_data_races in if-conversion, cselim and store motion.

PR tree-optimization/111793
* tree-ssa-alias.h (ref_can_have_store_data_races): Declare.
* tree-ssa-alias.cc (ref_can_have_store_data_races): New
function.
* tree-if-conv.cc (ifcvt_memrefs_wont_trap): Use
ref_can_have_store_data_races to allow more unconditional
stores.
* tree-ssa-loop-im.cc (execute_sm): Likewise.
* tree-ssa-phiopt.cc (cond_store_replacement): Likewise.

* gcc.dg/vect/vect-simd-clone-21.c: New testcase.

[Bug tree-optimization/115531] vectorizer generates inefficient code for masked conditional update loops

2024-06-17 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115531

--- Comment #3 from Tamar Christina  ---
(In reply to Andrew Pinski from comment #1)
> I suspect PR 20999 would fix this ...
> but we have to be careful since without masked stores, you could still
> vectorize this unlike the transformed version.
> 
> Maybe ifcvt can produce a masked store version if this pattern ...

Doing so during ifcvt forces you to commit to a masked operation.  So you lose
the ability to not vectorize for non-fully masked architectures.

So it's too early.  A vector pattern doesn't have this problem.  This question
was mostly about the degree to which the vectorizer supports MASK_STORE as an
input.  vect_get_vector_types_for_stmt seems to have support for it, so it looks
like it may work.

[Bug tree-optimization/115531] vectorizer generates inefficient code for masked conditional update loops

2024-06-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115531

--- Comment #2 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #1)
> Maybe ifcvt can produce a masked store version if this pattern ...

Maybe add another argument to .MASK_STORE to say it was originally an
unconditional store? Or something like that.

[Bug tree-optimization/115531] vectorizer generates inefficient code for masked conditional update loops

2024-06-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115531

Andrew Pinski  changed:

   What|Removed |Added

   Severity|normal  |enhancement

[Bug tree-optimization/115531] vectorizer generates inefficient code for masked conditional update loops

2024-06-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115531

Andrew Pinski  changed:

   What|Removed |Added

   Last reconfirmed||2024-06-18
   See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=20999
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW

--- Comment #1 from Andrew Pinski  ---
I suspect PR 20999 would fix this ...
but we have to be careful since without masked stores, you could still
vectorize this unlike the transformed version.

Maybe ifcvt can produce a masked store version if this pattern ...

[Bug tree-optimization/115531] New: vectorizer generates inefficient code for masked conditional update loops

2024-06-17 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115531

Bug ID: 115531
   Summary: vectorizer generates inefficient code for masked
conditional update loops
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tnfchris at gcc dot gnu.org
  Target Milestone: ---

The following code:

void __attribute__((noipa))
foo (char *restrict a, int *restrict b, int *restrict c, int n, int stride)
{
  if (stride <= 1)
    return;

  for (int i = 0; i < n; i++)
    {
      int res = c[i];
      int t = b[i+stride];
      if (a[i] != 0)
        res = t;
      c[i] = res;
    }
}

generates at -O3 -g0 -mcpu=generic+sve:

.L3:
ld1b    z29.s, p7/z, [x0, x5]
ld1w    z31.s, p7/z, [x2, x5, lsl 2]
ld1w    z30.s, p7/z, [x1, x5, lsl 2]
cmpne   p15.b, p6/z, z29.b, #0
sel z30.s, p15, z30.s, z31.s
st1w    z30.s, p7, [x2, x5, lsl 2]
add x5, x5, x4
whilelo p7.s, w5, w3
b.any   .L3
.L1:

and makes vectorization unprofitable until n is very large.
This is because the vector code has more instructions than needed.

Since it's a masked store, whenever a value is being conditionally set we don't
need the intermediate VEC_COND_EXPR.  This loop can be vectorized as:

.L3:
ld1b    z29.s, p7/z, [x0, x5]
ld1w    z31.s, p7/z, [x2, x5, lsl 2]
cmpne   p4.b, p6/z, z29.b, #0
st1w    z31.s, p4, [x2, x5, lsl 2]
add x5, x5, x4
whilelo p7.s, w5, w3
b.any   .L3
.L1:

I have prototyped a load-to-store forwarding optimization in forwprop, but I'm
looking to move it into the vectorizer so it can be costed properly.  However,
I'm not entirely sure what the best way to do so is.

I can certainly fix it up during codegen, but to cost it I need to do so during
analysis.  I could detect it during vectorizable_condition, but then the dead
load is still costed.  Or I could maybe use a pattern, but I'm unsure how to
represent the mask on the load.

Is it valid to produce a pattern with .IFN_MASK_STORE?

[Bug testsuite/114842] rs6000: Adjust some test cases with powerpc_vsx_ok

2024-06-17 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114842

--- Comment #5 from GCC Commits  ---
The master branch has been updated by Kewen Lin:

https://gcc.gnu.org/g:792ebb073252d2a4cecb0df23b6b702a8c55eec5

commit r15-1390-g792ebb073252d2a4cecb0df23b6b702a8c55eec5
Author: Kewen Lin 
Date:   Mon Jun 17 21:46:53 2024 -0500

testsuite, rs6000: Replace powerpc_altivec_ok with powerpc_altivec
[PR114842]

As noted in PR114842, most of the test cases which require
effective target check powerpc_altivec_ok actually care about
if ALTIVEC feature is enabled, and they should adopt effective
target powerpc_altivec instead.  Otherwise, when users are
specifying extra option -mno-altivec in RUNTESTFLAGS, the check
powerpc_altivec_ok returns true then the test case would be
tested without ALTIVEC so it would fail.  With commit r15-716,
dg-options and dg-additional-options can be taken into account
when evaluating powerpc_altivec, so this patch also moves
dg-{additional,}-options lines before dg-require-effective-target
to make it effective.

PR testsuite/114842

gcc/testsuite/ChangeLog:

* c-c++-common/pr72747-1.c: Replace powerpc_altivec_ok with
powerpc_altivec, move dg-options and dg-additional-options lines
before dg-require-effective-target lines when it doesn't cause
any side effect like note message.
* c-c++-common/pr72747-2.c: Likewise.
* g++.dg/torture/pr79905.C: Likewise.
* g++.target/powerpc/altivec-1.C: Likewise.
* g++.target/powerpc/altivec-10.C: Likewise.
* g++.target/powerpc/altivec-11.C: Likewise.
* g++.target/powerpc/altivec-12.C: Likewise.
* g++.target/powerpc/altivec-13.C: Likewise.
* g++.target/powerpc/altivec-14.C: Likewise.
* g++.target/powerpc/altivec-15.C: Likewise.
* g++.target/powerpc/altivec-16.C: Likewise.
* g++.target/powerpc/altivec-17.C: Likewise.
* g++.target/powerpc/altivec-18.C: Likewise.
* g++.target/powerpc/altivec-2.C: Likewise.
* g++.target/powerpc/altivec-4.C: Likewise.
* g++.target/powerpc/altivec-5.C: Likewise.
* g++.target/powerpc/altivec-6.C: Likewise.
* g++.target/powerpc/altivec-7.C: Likewise.
* g++.target/powerpc/altivec-8.C: Likewise.
* g++.target/powerpc/altivec-9.C: Likewise.
* g++.target/powerpc/altivec-cell-1.C: Likewise.
* g++.target/powerpc/altivec-cell-5.C: Likewise.
* g++.target/powerpc/altivec-types-1.C: Likewise.
* g++.target/powerpc/altivec-types-2.C: Likewise.
* g++.target/powerpc/altivec-types-3.C: Likewise.
* g++.target/powerpc/altivec-types-4.C: Likewise.
* gcc.target/powerpc/altivec-1-runnable.c: Likewise.
* gcc.target/powerpc/altivec-11.c: Likewise.
* gcc.target/powerpc/altivec-13.c: Likewise.
* gcc.target/powerpc/altivec-14.c: Likewise.
* gcc.target/powerpc/altivec-15.c: Likewise.
* gcc.target/powerpc/altivec-16.c: Likewise.
* gcc.target/powerpc/altivec-17.c: Likewise.
* gcc.target/powerpc/altivec-18.c: Likewise.
* gcc.target/powerpc/altivec-19.c: Likewise.
* gcc.target/powerpc/altivec-2.c: Likewise.
* gcc.target/powerpc/altivec-21.c: Likewise.
* gcc.target/powerpc/altivec-22.c: Likewise.
* gcc.target/powerpc/altivec-23.c: Likewise.
* gcc.target/powerpc/altivec-25.c: Likewise.
* gcc.target/powerpc/altivec-26.c: Likewise.
* gcc.target/powerpc/altivec-27.c: Likewise.
* gcc.target/powerpc/altivec-28.c: Likewise.
* gcc.target/powerpc/altivec-29.c: Likewise.
* gcc.target/powerpc/altivec-30.c: Likewise.
* gcc.target/powerpc/altivec-31.c: Likewise.
* gcc.target/powerpc/altivec-32.c: Likewise.
* gcc.target/powerpc/altivec-33.c: Likewise.
* gcc.target/powerpc/altivec-34.c: Likewise.
* gcc.target/powerpc/altivec-35.c: Likewise.
* gcc.target/powerpc/altivec-36.c: Likewise.
* gcc.target/powerpc/altivec-4.c: Likewise.
* gcc.target/powerpc/altivec-5.c: Likewise.
* gcc.target/powerpc/altivec-6.c: Likewise.
* gcc.target/powerpc/altivec-7.c: Likewise.
* gcc.target/powerpc/altivec-8.c: Likewise.
* gcc.target/powerpc/altivec-9.c: Likewise.
* gcc.target/powerpc/altivec-cell-1.c: Likewise.
* gcc.target/powerpc/altivec-cell-5.c: Likewise.
* gcc.target/powerpc/altivec-cell-6.c: Likewise.
* gcc.target/powerpc/altivec-cell-7.c: Likewise.
* gcc.target/powerpc/altivec-perm-1.c: Likewise.
* gcc.target/powerpc/altivec-perm-2.c: Likewise.
   

[Bug middle-end/115530] ICE: in verify_loop_structure, at cfgloop.cc:1741 with simd attribute and tm

2024-06-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115530

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||ice-on-valid-code

--- Comment #1 from Andrew Pinski  ---
-fgnu-tm is not well maintained but thanks for finding this bug.

[Bug c/115530] New: ICE: in verify_loop_structure, at cfgloop.cc:1741 with simd attribute

2024-06-17 Thread iamanonymous.cs at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115530

Bug ID: 115530
   Summary: ICE: in verify_loop_structure, at cfgloop.cc:1741 with
simd attribute
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: iamanonymous.cs at gmail dot com
  Target Milestone: ---

Compiler Explorer: https://godbolt.org/z/c4qPbxcr6

***
OS and Platform:
$ uname -a:
Linux ubuntu 4.15.0-213-generic #224-Ubuntu SMP Mon Jun 19 13:30:12 UTC 2023
x86_64 x86_64 x86_64 GNU/Linux
***
gcc version:
Using built-in specs.
COLLECT_GCC=/root/gcc_set/trunk-48a320a/bin/gcc
COLLECT_LTO_WRAPPER=/root/gcc_set/trunk-48a320a/libexec/gcc/x86_64-pc-linux-gnu/15.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc/configure --prefix=/root/gcc_set/trunk-48a320a
--with-gmp=/root/build_essential --with-mpfr=/root/build_essential
--with-mpc=/root/build_essential --enable-languages=c,c++ --disable-multilib
--with-sanitizer=address,undefined,thread,leak --enable-coverage
--disable-bootstrap
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 15.0.0 20240426 (experimental) (GCC) 

***
Program:
$ cat mutant.c
void *malloc();
__attribute__((simd)) int a() {
  int *b = malloc();
  __transaction_atomic { b[5] = 3; }
  return b[5];
}

***
Command Lines:
$ gcc -fgnu-tm -O mutant.c
mutant.c: In function ‘a’:
mutant.c:3:12: warning: too few arguments to built-in function ‘malloc’
expecting 1 [-Wbuiltin-declaration-mismatch]
3 |   int *b = malloc();
  |^~
mutant.c:1:7: note: declared here
1 | void *malloc();
  |   ^~
mutant.c: In function ‘a.simdclone.0’:
mutant.c:2:27: error: size of loop 1 should be 8, not 9
2 | __attribute__((simd)) int a() {
  |   ^
during GIMPLE pass: tmmark
mutant.c:2:27: internal compiler error: in verify_loop_structure, at
cfgloop.cc:1741
0x118afec verify_loop_structure()
../../gcc/gcc/cfgloop.cc:1741
0x1b99d62 execute_function_todo
../../gcc/gcc/passes.cc:2105
0x1b97b22 do_per_function
../../gcc/gcc/passes.cc:1688
0x1b9a04c execute_todo
../../gcc/gcc/passes.cc:2143
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

[Bug tree-optimization/115529] Optimization with "bit mask and compare equality" ((x & C1) == C2), ((x | C3) == C4)

2024-06-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115529

--- Comment #1 from Andrew Pinski  ---
>((x | C3) == C4)
shows up in PR 86975, see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86975#c2.

[Bug tree-optimization/115529] New: Optimization with "bit mask and compare equality" ((x & C1) == C2), ((x | C3) == C4)

2024-06-17 Thread Explorer09 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115529

Bug ID: 115529
   Summary: Optimization with "bit mask and compare equality" ((x
& C1) == C2), ((x | C3) == C4)
   Product: gcc
   Version: 14.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: Explorer09 at gmail dot com
  Target Milestone: ---

I thought this pattern is commonly seen, but I didn't find any issue report in
GCC mentioning this.

```c
#include <stdbool.h>
#include <stdint.h>

bool pred_bitor(unsigned int x) {
    return (x | 0x1F) == 0x381F;
}
bool pred_bitand(unsigned int x) {
    return (x & ~(uint32_t)0x1F) == 0x3800;
}
bool pred_shift(unsigned int x) {
    return (x >> 5) == (0x3800 >> 5);
}
```

The three predicates are equivalent, but GCC doesn't seem to recognize one can
be converted to another. Clang does recognize that, however.

My optimization request is this:
For the patterns ((x & C1) == C2) and ((x | C3) == C4), where C1 to C4 are all
compile time constants, try converting one code pattern to another, and figure
out which can generate faster or smaller code (or maybe both).

* If any constant happens to be already loaded into a register, then reusing
that constant can save code size by not needing to load the constant into a
register for the compare op. In other words, which pattern is best depends a
lot on the surrounding code, as well as on the CPU instruction set.

* My personal testing says that pred_bitand() can win in more cases than the
other two:
** In 32-bit ARM, the constant 0x3800 can fit into an immediate operand but
0x381F cannot, which means there would be an additional "mov" instruction for
0x381F.
** In RISC-V, the "andi" has a 16-bit encoding (with the "C"/Compressed
instruction extension) that works here, but there is no 16-bit encoding for
"ori" instruction.

* The pred_shift() case can work only if the mask constant is in the `((1 << N)
- 1)` form, or `((unsigned)-1 << M)` form. It doesn't work with all mask
constants but I think you knew that already.
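
The equivalence of the three forms can be sketched in C as follows. This is only an illustration under stated assumptions, not a proposed GCC transform: the helper names (`is_low_mask`, `pred_or`, `pred_and`, `pred_shift_n`) are hypothetical, and the rewrite is assumed to apply only to masks of the `((1 << N) - 1)` form with N < 32.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if mask has the form (1 << n) - 1, i.e. its
   set bits are contiguous starting at bit 0. */
static bool is_low_mask(uint32_t mask)
{
    return (mask & (mask + 1u)) == 0;
}

/* The OR form: (x | low) == c4. */
static bool pred_or(uint32_t x, uint32_t low, uint32_t c4)
{
    return (x | low) == c4;
}

/* The AND form. The OR form can only be true when every bit of low is
   also set in c4, so that condition is checked first. */
static bool pred_and(uint32_t x, uint32_t low, uint32_t c4)
{
    return (c4 & low) == low && (x & ~low) == (c4 & ~low);
}

/* The shift form for low == (1 << n) - 1, assuming n < 32 and that the
   low n bits of c4 are all set. */
static bool pred_shift_n(uint32_t x, unsigned n, uint32_t c4)
{
    return (x >> n) == (c4 >> n);
}
```

With the constants from the report (low = 0x1F, n = 5, c4 = 0x381F), all three helpers accept 0x3805 and reject 0x3820.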

[Bug tree-optimization/97405] ICE in get_or_alloc_expr_for in code hoisting with SVE intrinsics

2024-06-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97405

Andrew Pinski  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
   Target Milestone|--- |15.0
 Resolution|--- |FIXED

--- Comment #9 from Andrew Pinski  ---
Fixed.

[Bug tree-optimization/97405] ICE in get_or_alloc_expr_for in code hoisting with SVE intrinsics

2024-06-17 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97405

--- Comment #8 from GCC Commits  ---
The trunk branch has been updated by Andrew Pinski :

https://gcc.gnu.org/g:17979deb15d34dd4f036ca93d2977d0fc4d556a7

commit r15-1387-g17979deb15d34dd4f036ca93d2977d0fc4d556a7
Author: Andrew Pinski 
Date:   Mon Jun 17 16:45:34 2024 -0700

aarch64: Add testcase for PR97405

This aarch64 sve specific code was fixed by r15-917-gc9842f99042454
which added a riscv specific testcase so adding an aarch64 one to test
the fix does not regress is a good idea.

Committed as obvious after testing the testcase for aarch64-linux-gnu.

PR tree-optimization/97405

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve/pr97405-1.c: New test.

Signed-off-by: Andrew Pinski 

[Bug tree-optimization/97405] ICE in get_or_alloc_expr_for in code hoisting with SVE intrinsics

2024-06-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97405

Andrew Pinski  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115214
 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |pinskia at gcc dot gnu.org

--- Comment #7 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #6)
> Fixed on the trunk, I suspect r15-917-gc9842f99042454 fixed it.
> That revision is explicitly fixing the `POLY_INT_CST [16, 16] /[ex] 16` case too.
> 
> I think we should just add the testcase and close it as fixed.

yes that fixed it, will commit a testcase in a little bit.

[Bug c++/115501] [13/14/15 Regression] ICE: in build_call_a with dynamic_cast after invalid definition of __cxxabiv1::__dynamic_cast since r13-3299

2024-06-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115501

Andrew Pinski  changed:

   What|Removed |Added

 CC||pinskia at gcc dot gnu.org

--- Comment #7 from Andrew Pinski  ---
(In reply to Marek Polacek from comment #6)
> build_dynamic_cast_1 now calls pushdecl which calls duplicate_decls and that
> emits the "conflicting declaration" error and returns error_mark_node, so
> the subsequent build_cxx_call crashes on the error_mark_node.
> 
> Maybe we need just:
> 
> --- a/gcc/cp/rtti.cc
> +++ b/gcc/cp/rtti.cc
> @@ -793,6 +793,8 @@ build_dynamic_cast_1 (location_t loc, tree type, tree
> expr,
>   dcast_fn = pushdecl (dcast_fn, /*hiding=*/true);
>   pop_abi_namespace (flags);
>   dynamic_cast_node = dcast_fn;
> + if (dcast_fn == error_mark_node)
> +   return error_mark_node;
> }
>   result = build_cxx_call (dcast_fn, 4, elems, complain);
>   SET_EXPR_LOCATION (result, loc);

Most likely that check should be after the `!dcast_fn` check rather than inside
it so if you try to use dynamic_cast twice, the second one would not cause an
ICE.
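
The placement issue can be modeled in plain C. This is a toy sketch, not the rtti.cc code: `node`, `error_node`, `cached_fn`, `declare_helper` and `build_cast` are hypothetical stand-ins for the tree type, error_mark_node, dynamic_cast_node, pushdecl and build_dynamic_cast_1.

```c
#include <stddef.h>

/* Hypothetical stand-in for a tree node. */
typedef struct node { int id; } node;

static node error_node;         /* plays the role of error_mark_node */
static node *cached_fn = NULL;  /* plays the role of dynamic_cast_node */

/* Stand-in for pushdecl: always fails here, as with a conflicting
   declaration. */
static node *declare_helper(void)
{
    return &error_node;
}

/* Returns NULL on error, otherwise the helper to call.  *init_count is
   incremented each time the lazy initialization actually runs. */
static node *build_cast(int *init_count)
{
    node *fn = cached_fn;
    if (!fn) {
        fn = declare_helper();
        cached_fn = fn;         /* the failure gets cached as well */
        ++*init_count;
    }
    /* The check sits after the block, so a cached failure is also caught
       on the second and later calls. */
    if (fn == &error_node)
        return NULL;
    return fn;
}
```

If the check were inside the `if (!fn)` block instead, a second call would skip it and hand the cached error node to the caller.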

[Bug middle-end/115527] -ftrivial-auto-var-init appears to clobber explicit initializer

2024-06-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115527

Andrew Pinski  changed:

   What|Removed |Added

   Last reconfirmed||2024-06-17
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1

--- Comment #1 from Andrew Pinski  ---
Confirmed. It looks like the option is emitting the code afterwards rather than
before the initializer ...
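
A small C model of the suspected ordering, assuming the zero variant of the option: `memset` stands in for the automatic fill and is not what GCC actually emits, and both function names are hypothetical.

```c
#include <string.h>

struct inner {
    double a;
    char b;
    long l;
};

/* Intended order: automatic fill first, explicit initializer second. */
static double fill_then_initialize(void)
{
    struct inner s;
    memset(&s, 0, sizeof s);            /* models the auto-init fill */
    s = (struct inner){1.0, 0, 0};      /* explicit initializer wins */
    return s.a;
}

/* Order suspected in the comment above: the fill runs after the explicit
   initializer and overwrites it. */
static double initialize_then_fill(void)
{
    struct inner s = {1.0, 0, 0};
    memset(&s, 0, sizeof s);            /* clobbers the initializer */
    return s.a;
}
```

The first function returns 1.0; the second returns 0.0 on targets where an all-zero bit pattern represents 0.0, which matches the failing comparison in the report.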

[Bug target/115408] regression between gcc 13.3.0 and 14.1.0 using -mips16 and -minterlink-mips16

2024-06-17 Thread broly at mac dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115408

--- Comment #10 from gagan sidhu (broly)  ---
Created attachment 58456
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58456&action=edit
findutils 4.10 failure

[Bug fortran/115528] [15 regression] segmentation fault in legacy F77 code

2024-06-17 Thread juergen.reuter at desy dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115528

--- Comment #2 from Jürgen Reuter  ---
(In reply to Andrew Pinski from comment #1)
> what options are you using to compile the source?
> Does it work at -O0?

You are right: the problem doesn't appear for -O0. Our defaults are the libtool
defaults of -g -O2.

[Bug fortran/115390] Bogus -Wuninitialized warning when using CHARACTER(*) argument in BIND(C) function

2024-06-17 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115390

anlauf at gcc dot gnu.org changed:

   What|Removed |Added

 CC||anlauf at gcc dot gnu.org

--- Comment #3 from anlauf at gcc dot gnu.org ---
Created attachment 58455
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58455&action=edit
Patch

The attached patch fixes the ordering such that the bogus warning no
longer appears, and gives for the reduced test:

__attribute__((fn spec (". r ")))
void bar (struct CFI_cdesc_t00 & restrict _s)
{
  integer(kind=8) s.0;
  character(kind=1)[1:s.0] * s;
  bitsizetype D.4279;
  sizetype D.4280;

  s.0 = (integer(kind=8)) _s->elem_len;
  D.4279 = (bitsizetype) (sizetype) NON_LVALUE_EXPR  * 8;
  D.4280 = (sizetype) NON_LVALUE_EXPR ;
  s = (character(kind=1)[1:s.0] *) _s->base_addr;
  foo ((character(kind=1)[1:s.0] *) s, s.0);
}

Currently regtesting ...

[Bug fortran/115528] [15 regression] segmentation fault in legacy F77 code

2024-06-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115528

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||wrong-code
   Target Milestone|--- |15.0

--- Comment #1 from Andrew Pinski  ---
what options are you using to compile the source?
Does it work at -O0?

[Bug fortran/115528] New: [15 regression] segmentation fault in legacy F77 code

2024-06-17 Thread juergen.reuter at desy dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115528

Bug ID: 115528
   Summary: [15 regression] segmentation fault in legacy F77 code
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: juergen.reuter at desy dot de
  Target Milestone: ---

Some changes in gcc/gfortran between ca. June 10 and June 17, 2024 now lead to
segmentation faults in our application (Whizard v3.1.4, cf.
http://whizard.hepforge.org/whizard-3.1.4.tar.gz). Note that this appears in
our functional testsuite, which requires the OCaml compiler. The
segmentation fault appears e.g. as 
#0  0x7f827594f51f in ???
at ./signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0
#1  0x7f827a3cfd0a in curr_
at ../../../contrib/tauola/formf.f:599
#2  0x7f827a3e0e16 in dam4pi_
at ../../../contrib/tauola/tauola.f:4106
#3  0x7f827a3e1fb6 in dph4pi_
at ../../../contrib/tauola/tauola.f:4067
#4  0x7f827a3e6510 in dadnew_
at ../../../contrib/tauola/tauola.f:3667
#5  0x7f827a3e685c in dexnew_
at ../../../contrib/tauola/tauola.f:3592
#6  0x7f827a3ecc2d in dexay_
at ../../../contrib/tauola/tauola.f:525
#7  0x7f827a3eeffa in initdk_
at ../../../contrib/tauola/tauola_photos_ini.f:452

I need to come up with a workable reproducer, but that's not easy, and my time
nowadays is awfully limited. :(

[Bug target/115526] [14/15 regression] invalid assember emitted for alpha, "Error: duplicate !tlsgd!62"

2024-06-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115526

--- Comment #3 from Andrew Pinski  ---

beq $10,$L293
lda $13,_Py_tss_tstate($29) !tlsgd!62
...


$L293:
lda $16,1($31)
ldq $27,Balloc($29) !literal!78
jsr $26,($27),0 !lituse_jsr!78
ldah $29,0($26) !gpdisp!79
lda $29,0($29)  !gpdisp!79
mov $0,$10
beq $0,$L260
lda $1,625($31)
stl $1,24($0)
lda $1,1($31)
stl $1,20($0)
stq $0,5288($12)
lda $13,_Py_tss_tstate($29) !tlsgd!62
stq $31,0($0)

[Bug rtl-optimization/114729] RISC-V SPEC2017 507.cactu excessive spillls with -fschedule-insns

2024-06-17 Thread vineetg at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114729

--- Comment #12 from Vineet Gupta  ---
Interim update as I unpack sched1.

There's the "main" scheduling algorithm which involves 4 queues and an FSM but
can be ignored for this update.

There's "model" schedule which is the pressure sensitive sub-algorithm and can
be considered separate from the main one.

Note that per comments in src code:
   "We do not apply this model schedule to the rtx stream.  We simply record
   it in model_schedule.  We also compute the maximum pressure,
   MP, that was seen during this schedule."

@curr_reg_pressure[GR_REGS] is initially setup with df live info.

schedule_block 
  model_start_schedule
   initiate_reg_pressure_info (df_get_live_in (bb))
mark_regno_birth_or_death (curr_reg_live, curr_reg_pressure, j,
birth_p=true);

model_worklist is seeded with insns with dep=0 which is then iteratively
processed with deps SD_LIST_HARD_BACK insns.

 while (model_worklist)
   model_choose_insn ()

   insn = model_worklist
   for (;;)
   {
if (model_classify_pressure(insn) < 0)
break;
insn = insn->next
   }
   update_register_pressure(insn)

insn 35 is starting point of investigation, a load insn which creates pseudo
r170 used by the later insns involved in spill.

;;  +--- worklist:
;;  +---   35 [2, 7, 2, 10]
...
;;  |  14   35 |272   10 | r170#0=[r242+low(`u')]
GR_REGS:[26,+1]
  
Note that the prints are slightly misleading: the first number is the pressure
before this insn, and +N shows the pressure increment due to this insn.

  model_record_pressures (insn);   <-- prints curr_reg_pressure[]
  update_register_pressure (insn->insn);   <-- updates curr_reg_pressure[] due
to insn

Looking back at comment 10, the insns we want to really track are

  ;;|   46 |7 | r180=zxt(r170,0x10,0x10)  
  ;;|   54 |6 | [r230+low(const(`_Z1sv'+0x2))]=r180#0
  ;;|   55 |7 | r188=r170 0>>0x20 
  ;;|   64 |6 | [r230+low(const(`_Z1sv'+0x4))]=r188#0

(the above is pre-schedule/ideal, but what we see in the end is that 46 and 55
are next to each other, needing separate hard regs for r180 and r188)

Anyhow, the model schedule records that we do hit 28, which is one more than
the max reg pressure 27 for GR_REGS.

;; |  16   46 | 2 7 3 7 | r180=zxt(r170,0x10,0x10)  
GR_REGS:[27,+1]
;; |  17   54 | 2 7 4 6 | [r230+low(const(`_Z1sv'+0x2))]=r180#0 
GR_REGS:[28,-1]
;; |  18   55 | 2 7 3 7 | r188=r170 0>>0x20 
GR_REGS:[27,+1]
;; |  19   64 | 2 7 4 6 | [r230+low(const(`_Z1sv'+0x4))]=r188#0 
GR_REGS:[28,-1]
..
;; Pressure summary: GR_REGS:28 FP_REGS:1


As a hack I did switch the recording/updating (to record the right pressure
point)

  update_register_pressure (insn->insn);
  model_record_pressures (insn); 

That does move insn stream around, but still doesn't separate insn 46 and 55
enough to not allocate new hard reg for latter.

In summary the model pressure calculation seems fine, need to see why main algo
is not switching things around.
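
The difference between the two recording orders can be sketched with a toy model. Assumptions: this is not haifa-sched code, each insn is reduced to a single signed pressure delta, and both function names are hypothetical.

```c
/* Record the pressure seen *before* each insn, then apply its delta.
   This models the original order (record, then update).  Returns the
   maximum recorded value. */
static int record_then_update(int start, const int delta[], int n, int out[])
{
    int cur = start, max = start;
    for (int i = 0; i < n; i++) {
        out[i] = cur;
        if (cur > max)
            max = cur;
        cur += delta[i];
    }
    return max;
}

/* Apply each insn's delta first, then record.  This models the swapped
   order used in the hack above. */
static int update_then_record(int start, const int delta[], int n, int out[])
{
    int cur = start, max = start;
    for (int i = 0; i < n; i++) {
        cur += delta[i];
        out[i] = cur;
        if (cur > max)
            max = cur;
    }
    return max;
}
```

With a start pressure of 27 and the deltas +1, -1, +1, -1 from insns 46, 54, 55 and 64, the first order records 27, 28, 27, 28 as in the dump, and the second records 28, 27, 28, 27. The maximum is 28 either way, consistent with the pressure summary.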

[Bug target/115526] [14/15 regression] invalid assember emitted for alpha, "Error: duplicate !tlsgd!62"

2024-06-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115526

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |14.2

[Bug driver/115440] unrecognized command-line option '--c++17'; did you mean '--stdc++17'?

2024-06-17 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115440

--- Comment #4 from GCC Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:96db57948b50f45235ae4af3b46db66cae7ea859

commit r15-1384-g96db57948b50f45235ae4af3b46db66cae7ea859
Author: Jakub Jelinek 
Date:   Mon Jun 17 22:02:46 2024 +0200

diagnostics: Fix add_misspelling_candidates [PR115440]

The option_map array for most entries contains just non-NULL opt0
{ "-Wno-", NULL, "-W", false, true },
{ "-fno-", NULL, "-f", false, true },
{ "-gno-", NULL, "-g", false, true },
{ "-mno-", NULL, "-m", false, true },
{ "--debug=", NULL, "-g", false, false },
{ "--machine-", NULL, "-m", true, false },
{ "--machine-no-", NULL, "-m", false, true },
{ "--machine=", NULL, "-m", false, false },
{ "--machine=no-", NULL, "-m", false, true },
{ "--machine", "", "-m", false, false },
{ "--machine", "no-", "-m", false, true },
{ "--optimize=", NULL, "-O", false, false },
{ "--std=", NULL, "-std=", false, false },
{ "--std", "", "-std=", false, false },
{ "--warn-", NULL, "-W", true, false },
{ "--warn-no-", NULL, "-W", false, true },
{ "--", NULL, "-f", true, false },
{ "--no-", NULL, "-f", false, true }
and so add_misspelling_candidates works correctly for it, but 3 out of
these,
{ "--machine", "", "-m", false, false },
{ "--machine", "no-", "-m", false, true },
and
{ "--std", "", "-std=", false, false },
use non-NULL opt1.  That says that
--machine foo
should map to
-mfoo
and
--machine no-foo
should map to
-mno-foo
and
--std c++17
should map to
-std=c++17
add_misspelling_candidates was not handling this, so it happily
registered say
--stdc++17
or
--machineavx512
(twice) as spelling alternatives, when those options aren't recognized.
Instead we support
--std c++17
or
--machine avx512
--machine no-avx512

The following patch fixes that.  On this particular testcase, we no longer
suggest anything, even when among the suggestion is say that
--std c++17
or
-std=c++17
etc.

2024-06-17  Jakub Jelinek  

PR driver/115440
* opts-common.cc (add_misspelling_candidates): If opt1 is non-NULL,
add a space and opt1 to the alternative suggestion text.

* g++.dg/cpp1z/pr115440.C: New test.
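
The rule in the ChangeLog entry can be sketched as follows. This is a simplified illustration, not the opts-common.cc code: `struct map_entry` keeps only three of the option_map fields, and `alt_spelling` is a hypothetical helper name.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Simplified, hypothetical version of an option_map entry. */
struct map_entry {
    const char *opt0;        /* long spelling, e.g. "--std" */
    const char *opt1;        /* separate-argument prefix, or NULL */
    const char *new_prefix;  /* canonical prefix, e.g. "-std=" */
};

/* Build the alternative spelling of the canonical option text opt under
   map entry m.  Returns 0 on success, -1 if opt does not start with
   m->new_prefix or the buffer is too small. */
static int alt_spelling(const struct map_entry *m, const char *opt,
                        char *buf, size_t len)
{
    size_t plen = strlen(m->new_prefix);
    int n;

    if (strncmp(opt, m->new_prefix, plen) != 0)
        return -1;
    if (m->opt1)
        /* Non-NULL opt1: the remainder is a separate argument, so a space
           goes between opt0 and opt1. */
        n = snprintf(buf, len, "%s %s%s", m->opt0, m->opt1, opt + plen);
    else
        n = snprintf(buf, len, "%s%s", m->opt0, opt + plen);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

For the entry { "--std", "", "-std=" } and the text "-std=c++17" this yields "--std c++17" rather than "--stdc++17".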

[Bug c/115527] New: -ftrivial-auto-var-init appears to clobber explicit initializer

2024-06-17 Thread nora at norasandler dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115527

Bug ID: 115527
   Summary: -ftrivial-auto-var-init appears to clobber explicit
initializer
   Product: gcc
   Version: 13.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: nora at norasandler dot com
  Target Milestone: ---

This program behaves incorrectly when compiled with
-ftrivial-auto-var-init=pattern or -ftrivial-auto-var-init=zero:

// Bug in GCC -ftrivial-auto-var-init
struct inner {
    double a;
    char b;
    long l;
};

struct outer {
    struct inner in_array[3];
    int bar;
};

int main(void) {
    struct inner si = {1., 0, 0};
    struct outer o = {// in_array
                      {
                          si,
                          {0, 0, 0},
                          {0, 0, 0},
                      },
                      // bar
                      0};

    if (o.in_array[0].a != 1.) {
        return 1; // fail
    }
    return 0;
}

Without the -ftrivial-auto-var-init option, this returns 0, as expected. With
-ftrivial-auto-var-init=pattern or -ftrivial-auto-var-init=zero it returns 1:


$ gcc -ftrivial-auto-var-init=pattern auto-var-bug.c
$ ./a.out
$ echo $?
1
$ gcc auto-var-bug.c
$ ./a.out
$ echo $?
0

I saw this behavior in GCC 13.1.0, 13.2.0, and 14.0.1 (I haven't tried it with
other versions).

Full version info for the version of 13.2.0 I tested this with:
$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-linux-gnu/13/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu
13.2.0-23ubuntu4' --with-bugurl=file:///usr/share/doc/gcc-13/README.Bugs
--enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --prefix=/usr
--with-gcc-major-version-only --program-suffix=-13
--program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id
--libexecdir=/usr/libexec --without-included-gettext --enable-threads=posix
--libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug
--enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new
--enable-libstdcxx-backtrace --enable-gnu-unique-object --disable-vtable-verify
--enable-plugin --enable-default-pie --with-system-zlib
--enable-libphobos-checking=release --with-target-system-zlib=auto
--enable-objc-gc=auto --enable-multiarch --disable-werror --enable-cet
--with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32
--enable-multilib --with-tune=generic
--enable-offload-targets=nvptx-none=/build/gcc-13-uJ7kn6/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-uJ7kn6/gcc-13-13.2.0/debian/tmp-gcn/usr
--enable-offload-defaulted --without-cuda-driver --enable-checking=release
--build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 13.2.0 (Ubuntu 13.2.0-23ubuntu4)

[Bug c++/115377] [11/12/13/14/15 Regression] Invalid typename with non nested-name-specifier accepted in specific contexts since r5-2683

2024-06-17 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115377

--- Comment #4 from Marek Polacek  ---
Patch thread: https://gcc.gnu.org/pipermail/gcc-patches/2014-August/395789.html

The patch doesn't seem to match what was actually committed?

[Bug c++/115501] [13/14/15 Regression] ICE: in build_call_a with dynamic_cast after invalid definition of __cxxabiv1::__dynamic_cast since r13-3299

2024-06-17 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115501

Marek Polacek  changed:

   What|Removed |Added

 CC||mpolacek at gcc dot gnu.org

--- Comment #6 from Marek Polacek  ---
build_dynamic_cast_1 now calls pushdecl which calls duplicate_decls and that
emits the "conflicting declaration" error and returns error_mark_node, so the
subsequent build_cxx_call crashes on the error_mark_node.

Maybe we need just:

--- a/gcc/cp/rtti.cc
+++ b/gcc/cp/rtti.cc
@@ -793,6 +793,8 @@ build_dynamic_cast_1 (location_t loc, tree type, tree expr,
  dcast_fn = pushdecl (dcast_fn, /*hiding=*/true);
  pop_abi_namespace (flags);
  dynamic_cast_node = dcast_fn;
+ if (dcast_fn == error_mark_node)
+   return error_mark_node;
}
  result = build_cxx_call (dcast_fn, 4, elems, complain);
  SET_EXPR_LOCATION (result, loc);

[Bug libstdc++/115522] [13/14/15 Regression] std::to_array no longer works for struct which is trivial but not default constructible

2024-06-17 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115522

--- Comment #4 from Jonathan Wakely  ---
(In reply to Jonathan Wakely from comment #3)
>static_assert(is_constructible_v<_Tp, _Tp&>);
>if constexpr (is_constructible_v<_Tp, _Tp&>)
> {
> - if constexpr (is_trivial_v<_Tp>)
> + if constexpr (is_trivial_v<_Tp> && is_default_constructible_v<_Tp>
> +  && is_copy_assignable_v<_Tp>)

For the testcase above, it would be sufficient to do:

 if constexpr (is_trivial_v<_Tp> && is_copy_assignable_v<_Tp>)

The type with const members isn't assignable.

I'm not sure if this covers all cases. I think combined with is_trivial it
should do, but then I don't understand why the struct ranges type is trivial in
the first place.

[Bug tree-optimization/115520] Loop vectorization fails when bool variable instead of unsigned char

2024-06-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115520

--- Comment #4 from Andrew Pinski  ---
(In reply to Max S. from comment #3)
> Ok, thanks for the hint with SRA. Now I know how to actually program it. 
> 
> I think the main problem is the warning/error message:
> > not vectorized: relevant phi not supported: matched_21 = PHI <_20(6), 1(5)>
> 
> Is there some way to improve this?

See PR 101639 which I think this is basically a dup of really.

[Bug target/115526] regression in 14: invalid assember emitted for alpha, "Error: duplicate !tlsgd!62"

2024-06-17 Thread dilfridge at gentoo dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115526

--- Comment #1 from Andreas K. Huettel  ---
Created attachment 58453
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58453&action=edit
preprocessed source file

[Bug target/115526] regression in 14: invalid assember emitted for alpha, "Error: duplicate !tlsgd!62"

2024-06-17 Thread dilfridge at gentoo dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115526

--- Comment #2 from Andreas K. Huettel  ---
Created attachment 58454
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58454&action=edit
assembler source file

[Bug target/115526] New: regression in 14: invalid assember emitted for alpha, "Error: duplicate !tlsgd!62"

2024-06-17 Thread dilfridge at gentoo dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115526

Bug ID: 115526
   Summary: regression in 14: invalid assember emitted for alpha,
"Error: duplicate !tlsgd!62"
   Product: gcc
   Version: 14.1.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: dilfridge at gentoo dot org
  Target Milestone: ---

Compiling Python 3.12.4 on Gentoo for alpha fails with the following message:

alpha-unknown-linux-gnu-gcc -c -fno-strict-overflow -Wsign-compare -mieee
-DNDEBUG -mieee -pipe -O2 -mcpu=ev4 -fwrapv -std=c11 -Wextra
-Wno-unused-parameter  
  -Wno-missing-field-initializers -Wstrict-prototypes
-Werror=implicit-function-declaration -fvisibility=hidden  -I./Include/internal
 -I. -I./Include -I/usr/include/ncursesw   
  -fPIC -DPy_BUILD_CORE  -o Python/dtoa.o Python/dtoa.c
{standard input}: Assembler messages:
{standard input}:1773: Error: duplicate !tlsgd!62
make: *** [Makefile:2748: Python/dtoa.o] Error 1

This worked fine with gcc-13 and is now broken with

dilfridge-alpha / # gcc --version
gcc (Gentoo 14.1.1_p20240615 p2) 14.1.1 20240615
[...]
dilfridge-alpha / # as --version
GNU assembler (Gentoo 2.42 p3) 2.42.0

Preprocessed source and assembler source follow.

[Bug tree-optimization/115520] Loop vectorization fails when bool variable instead of unsigned char

2024-06-17 Thread max.sagebaum at scicomp dot uni-kl.de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115520

--- Comment #3 from Max S.  ---
Ok, thanks for the hint with SRA. Now I know how to actually program it. 

I think the main problem is the warning/error message:
> not vectorized: relevant phi not supported: matched_21 = PHI <_20(6), 1(5)>

Is there some way to improve this?

[Bug c/115290] [12 Regression] tree check fail in c_tree_printer, at c/c-objc-common.cc:330

2024-06-17 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115290

Jakub Jelinek  changed:

   What|Removed |Added

Summary|[12/13/14/15 Regression]|[12 Regression] tree check
   |tree check fail in  |fail in c_tree_printer, at
   |c_tree_printer, at  |c/c-objc-common.cc:330
   |c/c-objc-common.cc:330  |

--- Comment #6 from Jakub Jelinek  ---
Should be fixed for 13.4+/14.2+/15.1+ for now, 12 branch is frozen right now,
so it will need to wait for 12.5.

[Bug c/115290] [12/13/14/15 Regression] tree check fail in c_tree_printer, at c/c-objc-common.cc:330

2024-06-17 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115290

--- Comment #5 from GCC Commits  ---
The releases/gcc-13 branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:be14e6cf7f2dc23012dfced0a4aff0894fd6ff57

commit r13-8854-gbe14e6cf7f2dc23012dfced0a4aff0894fd6ff57
Author: Jakub Jelinek 
Date:   Mon Jun 17 19:24:05 2024 +0200

c-family: Fix -Warray-compare warning ICE [PR115290]

The warning code uses %D to print the ARRAY_REF first operands.
That works in the most common case where those operands are decls, but
as can be seen on the following testcase, they can be other expressions
with array type.
Just changing %D to %E isn't enough, because then the diagnostics can
suggest something like
note: use '&(x) != 0 ? (int (*)[32])&a : (int (*)[32])&b[0] == &(y) != 0 ?
(int (*)[32])&a : (int (*)[32])&b[0]' to compare the addresses
which is a bad suggestion, the %E printing doesn't know that the
warning code will want to add & before it and [0] after it.
So, the following patch adds ()s around the operand as well, but does
that only for non-decls, for decls keeps it as &arr[0] like before.

2024-06-17  Jakub Jelinek  

PR c/115290
* c-warn.cc (do_warn_array_compare): Use %E rather than %D for
printing op0 and op1; if those operands aren't decls, also print
parens around them.

* c-c++-common/Warray-compare-3.c: New test.

(cherry picked from commit b63c7d92012f92e0517190cf263d29bbef8a06bf)

[Bug c/115290] [12/13/14/15 Regression] tree check fail in c_tree_printer, at c/c-objc-common.cc:330

2024-06-17 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115290

--- Comment #4 from GCC Commits  ---
The releases/gcc-14 branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:922648759b034c356e7d5c1ae530bdb6f3d00c62

commit r14-10322-g922648759b034c356e7d5c1ae530bdb6f3d00c62
Author: Jakub Jelinek 
Date:   Mon Jun 17 19:24:05 2024 +0200

c-family: Fix -Warray-compare warning ICE [PR115290]

The warning code uses %D to print the ARRAY_REF first operands.
That works in the most common case where those operands are decls, but
as can be seen on the following testcase, they can be other expressions
with array type.
Just changing %D to %E isn't enough, because then the diagnostics can
suggest something like
note: use '&(x) != 0 ? (int (*)[32])&a : (int (*)[32])&b[0] == &(y) != 0 ?
(int (*)[32])&a : (int (*)[32])&b[0]' to compare the addresses
which is a bad suggestion, the %E printing doesn't know that the
warning code will want to add & before it and [0] after it.
So, the following patch adds ()s around the operand as well, but does
that only for non-decls, for decls keeps it as &arr[0] like before.

2024-06-17  Jakub Jelinek  

PR c/115290
* c-warn.cc (do_warn_array_compare): Use %E rather than %D for
printing op0 and op1; if those operands aren't decls, also print
parens around them.

* c-c++-common/Warray-compare-3.c: New test.

(cherry picked from commit b63c7d92012f92e0517190cf263d29bbef8a06bf)

[Bug c/115290] [12/13/14/15 Regression] tree check fail in c_tree_printer, at c/c-objc-common.cc:330

2024-06-17 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115290

--- Comment #3 from GCC Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:b63c7d92012f92e0517190cf263d29bbef8a06bf

commit r15-1381-gb63c7d92012f92e0517190cf263d29bbef8a06bf
Author: Jakub Jelinek 
Date:   Mon Jun 17 19:24:05 2024 +0200

c-family: Fix -Warray-compare warning ICE [PR115290]

The warning code uses %D to print the ARRAY_REF first operands.
That works in the most common case where those operands are decls, but
as can be seen on the following testcase, they can be other expressions
with array type.
Just changing %D to %E isn't enough, because then the diagnostics can
suggest something like
note: use '&(x) != 0 ? (int (*)[32])&a : (int (*)[32])&b[0] == &(y) != 0 ?
(int (*)[32])&a : (int (*)[32])&b[0]' to compare the addresses
which is a bad suggestion, the %E printing doesn't know that the
warning code will want to add & before it and [0] after it.
So, the following patch adds ()s around the operand as well, but does
that only for non-decls, for decls keeps it as &arr[0] like before.

2024-06-17  Jakub Jelinek  

PR c/115290
* c-warn.cc (do_warn_array_compare): Use %E rather than %D for
printing op0 and op1; if those operands aren't decls, also print
parens around them.

* c-c++-common/Warray-compare-3.c: New test.
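
The parenthesization rule from the commit message can be sketched like this. It is an illustration only, not the c-warn.cc code: `format_first_elt` is a hypothetical helper, and the operand is assumed to be already printed as text.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Format the suggested "address of first element" spelling for an array
   operand.  Decls keep the plain &arr[0] form; any other expression is
   wrapped in parentheses so that & and [0] apply to the whole operand.
   Returns 0 on success, -1 if the buffer is too small. */
static int format_first_elt(char *buf, size_t len, const char *expr,
                            bool is_decl)
{
    int n = is_decl ? snprintf(buf, len, "&%s[0]", expr)
                    : snprintf(buf, len, "&(%s)[0]", expr);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

For a decl `arr` this gives `&arr[0]`; for a conditional operand such as `x ? a : b` it gives `&(x ? a : b)[0]`.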

[Bug target/111343] [SH] Including in C++23 causes an ICE with -m4-single-only

2024-06-17 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111343

--- Comment #6 from GCC Commits  ---
The releases/gcc-13 branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:6d0a0c547a6c8425d432129fc90869305fef7bc2

commit r13-8853-g6d0a0c547a6c8425d432129fc90869305fef7bc2
Author: Jakub Jelinek 
Date:   Mon Jun 17 18:53:21 2024 +0200

c++: Fix up floating point conversion rank comparison for _Float32 and
float if float/double are same size [PR115511]

On AVR and SH with some options sizeof (float) == sizeof (double) and
the 2 types have the same set of values.
http://eel.is/c++draft/conv.rank#2.2 for this says that double still
has bigger rank than float and http://eel.is/c++draft/conv.rank#2.2
says that extended type with the same set of values as more than one
standard floating point type shall have the same rank as double.
I've implemented the latter rule as
   if (cnt > 1 && mv2 == long_double_type_node)
 return -2;
with the _Float64/double/long double case having same mode case (various
targets with -mlong-double-64) in mind.
But never thought there are actually targets where float and double
are the same, that needs handling too, if cnt > 1 (that is the extended
type mv1 has same set of values as 2 or 3 of float/double/long double)
and mv2 is float, we need to return 2, because mv1 in that case should
have same rank as double and double has bigger rank than float.

2024-06-17  Jakub Jelinek  

PR target/111343
PR c++/115511
* typeck.cc (cp_compare_floating_point_conversion_ranks): If an
extended floating point type mv1 has same set of values as more
than one standard floating point type and mv2 is float, return 2.

* g++.dg/cpp23/ext-floating18.C: New test.

(cherry picked from commit 8584c98f370cd91647c184ce58141508ca478a12)
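
The rule described in the commit message can be modeled as below. This is a sketch based only on that description, not the typeck.cc code: `enum std_fp` and `compare_ext_rank` are hypothetical, the subrank comparison is not modeled, and a return value of 0 simply stands for "same rank as double".

```c
#include <assert.h>

/* Hypothetical tags for the three standard floating point types. */
enum std_fp { STD_FLOAT, STD_DOUBLE, STD_LONG_DOUBLE };

/* Compare the rank of an extended type mv1 against the standard type mv2.
   cnt is the number of standard types with the same set of values as mv1;
   only the cnt > 1 case is modeled.  Returns 2 if mv1 has the greater
   rank, -2 if it has the lesser rank, 0 if both rank like double. */
static int compare_ext_rank(int cnt, enum std_fp mv2)
{
    assert(cnt > 1);  /* with cnt > 1, mv1 ranks the same as double */
    if (mv2 == STD_LONG_DOUBLE)
        return -2;    /* long double ranks above double */
    if (mv2 == STD_FLOAT)
        return 2;     /* the case added by this fix */
    return 0;
}
```

On a target where float and double have the same set of values, cnt is at least 2 for _Float32, so comparing it against float yields 2 instead of falling through.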

[Bug c++/115511] ICE on ambigous overload for _Float32

2024-06-17 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115511

--- Comment #8 from GCC Commits  ---
The releases/gcc-13 branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:6d0a0c547a6c8425d432129fc90869305fef7bc2

commit r13-8853-g6d0a0c547a6c8425d432129fc90869305fef7bc2
Author: Jakub Jelinek 
Date:   Mon Jun 17 18:53:21 2024 +0200

c++: Fix up floating point conversion rank comparison for _Float32 and
float if float/double are same size [PR115511]

On AVR and SH with some options sizeof (float) == sizeof (double) and
the 2 types have the same set of values.
http://eel.is/c++draft/conv.rank#2.2 for this says that double still
has bigger rank than float and http://eel.is/c++draft/conv.rank#2.2
says that extended type with the same set of values as more than one
standard floating point type shall have the same rank as double.
I've implemented the latter rule as
   if (cnt > 1 && mv2 == long_double_type_node)
 return -2;
with the _Float64/double/long double case having same mode case (various
targets with -mlong-double-64) in mind.
But never thought there are actually targets where float and double
are the same, that needs handling too, if cnt > 1 (that is the extended
type mv1 has same set of values as 2 or 3 of float/double/long double)
and mv2 is float, we need to return 2, because mv1 in that case should
have same rank as double and double has bigger rank than float.

2024-06-17  Jakub Jelinek  

PR target/111343
PR c++/115511
* typeck.cc (cp_compare_floating_point_conversion_ranks): If an
extended floating point type mv1 has same set of values as more
than one standard floating point type and mv2 is float, return 2.

* g++.dg/cpp23/ext-floating18.C: New test.

(cherry picked from commit 8584c98f370cd91647c184ce58141508ca478a12)

[Bug c++/115511] ICE on ambigous overload for _Float32

2024-06-17 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115511

--- Comment #7 from GCC Commits  ---
The releases/gcc-14 branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:5be6d9d2a9854c05f3c019deb9fe95eca7248140

commit r14-10321-g5be6d9d2a9854c05f3c019deb9fe95eca7248140
Author: Jakub Jelinek 
Date:   Mon Jun 17 18:53:21 2024 +0200

c++: Fix up floating point conversion rank comparison for _Float32 and
float if float/double are same size [PR115511]

On AVR and SH with some options sizeof (float) == sizeof (double) and
the 2 types have the same set of values.
http://eel.is/c++draft/conv.rank#2.2 for this says that double still
has bigger rank than float and http://eel.is/c++draft/conv.rank#2.2
says that extended type with the same set of values as more than one
standard floating point type shall have the same rank as double.
I've implemented the latter rule as
  if (cnt > 1 && mv2 == long_double_type_node)
    return -2;
with the _Float64/double/long double case having same mode case (various
targets with -mlong-double-64) in mind.
But never thought there are actually targets where float and double
are the same, that needs handling too, if cnt > 1 (that is the extended
type mv1 has same set of values as 2 or 3 of float/double/long double)
and mv2 is float, we need to return 2, because mv1 in that case should
have same rank as double and double has bigger rank than float.

2024-06-17  Jakub Jelinek  

PR target/111343
PR c++/115511
* typeck.cc (cp_compare_floating_point_conversion_ranks): If an
extended floating point type mv1 has same set of values as more
than one standard floating point type and mv2 is float, return 2.

* g++.dg/cpp23/ext-floating18.C: New test.

(cherry picked from commit 8584c98f370cd91647c184ce58141508ca478a12)

[Bug target/111343] [SH] Including in C++23 causes an ICE with -m4-single-only

2024-06-17 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111343

--- Comment #5 from GCC Commits  ---
The releases/gcc-14 branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:5be6d9d2a9854c05f3c019deb9fe95eca7248140

commit r14-10321-g5be6d9d2a9854c05f3c019deb9fe95eca7248140
Author: Jakub Jelinek 
Date:   Mon Jun 17 18:53:21 2024 +0200

c++: Fix up floating point conversion rank comparison for _Float32 and
float if float/double are same size [PR115511]

On AVR and SH with some options sizeof (float) == sizeof (double) and
the 2 types have the same set of values.
http://eel.is/c++draft/conv.rank#2.2 for this says that double still
has bigger rank than float and http://eel.is/c++draft/conv.rank#2.2
says that extended type with the same set of values as more than one
standard floating point type shall have the same rank as double.
I've implemented the latter rule as
  if (cnt > 1 && mv2 == long_double_type_node)
    return -2;
with the _Float64/double/long double case having same mode case (various
targets with -mlong-double-64) in mind.
But never thought there are actually targets where float and double
are the same, that needs handling too, if cnt > 1 (that is the extended
type mv1 has same set of values as 2 or 3 of float/double/long double)
and mv2 is float, we need to return 2, because mv1 in that case should
have same rank as double and double has bigger rank than float.

2024-06-17  Jakub Jelinek  

PR target/111343
PR c++/115511
* typeck.cc (cp_compare_floating_point_conversion_ranks): If an
extended floating point type mv1 has same set of values as more
than one standard floating point type and mv2 is float, return 2.

* g++.dg/cpp23/ext-floating18.C: New test.

(cherry picked from commit 8584c98f370cd91647c184ce58141508ca478a12)

[Bug middle-end/47081] Macro usage too clever for localization

2024-06-17 Thread jsm28 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=47081

--- Comment #7 from Joseph S. Myers  ---
There no longer seems to be a function "fatal" with a parameter using the msgid
naming convention that could be confused with the (non-translating) function
called here. That explains why these messages are now (correctly) not extracted
for translation, although they were at the time of this bug report. It seems
such a function was removed from collect2 in commit
9e350e99cb9f93ea99216c9c2a40517111636116 (May 2011) and from lto-wrapper in
commit ffb1f5ef12ce9d9994e850d87cbe4116a69d8d90 (June 2014).

[Bug rtl-optimization/115523] [avr] Remove SFmode insns

2024-06-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115523

--- Comment #5 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #4)
> I am not 100% sure if this is the latest version of the patch set though,
> but it was posted to the gcc-patches@ list in late May:
> https://inbox.sourceware.org/gcc-patches/20240512225738.528917-2-juzhe.
> zh...@rivai.ai/

Oh, this is just the analysis part; the actual code-changing part has not been
posted yet:
> And we will enable register coalesce with subreg liveness tracking in the
> followup patches.

[Bug c++/115511] ICE on ambigous overload for _Float32

2024-06-17 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115511

--- Comment #6 from GCC Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:8584c98f370cd91647c184ce58141508ca478a12

commit r15-1380-g8584c98f370cd91647c184ce58141508ca478a12
Author: Jakub Jelinek 
Date:   Mon Jun 17 18:53:21 2024 +0200

c++: Fix up floating point conversion rank comparison for _Float32 and
float if float/double are same size [PR115511]

On AVR and SH with some options sizeof (float) == sizeof (double) and
the 2 types have the same set of values.
http://eel.is/c++draft/conv.rank#2.2 for this says that double still
has bigger rank than float and http://eel.is/c++draft/conv.rank#2.2
says that extended type with the same set of values as more than one
standard floating point type shall have the same rank as double.
I've implemented the latter rule as
  if (cnt > 1 && mv2 == long_double_type_node)
    return -2;
with the _Float64/double/long double case having same mode case (various
targets with -mlong-double-64) in mind.
But never thought there are actually targets where float and double
are the same, that needs handling too, if cnt > 1 (that is the extended
type mv1 has same set of values as 2 or 3 of float/double/long double)
and mv2 is float, we need to return 2, because mv1 in that case should
have same rank as double and double has bigger rank than float.

2024-06-17  Jakub Jelinek  

PR target/111343
PR c++/115511
* typeck.cc (cp_compare_floating_point_conversion_ranks): If an
extended floating point type mv1 has same set of values as more
than one standard floating point type and mv2 is float, return 2.

* g++.dg/cpp23/ext-floating18.C: New test.

[Bug target/111343] [SH] Including in C++23 causes an ICE with -m4-single-only

2024-06-17 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111343

--- Comment #4 from GCC Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:8584c98f370cd91647c184ce58141508ca478a12

commit r15-1380-g8584c98f370cd91647c184ce58141508ca478a12
Author: Jakub Jelinek 
Date:   Mon Jun 17 18:53:21 2024 +0200

c++: Fix up floating point conversion rank comparison for _Float32 and
float if float/double are same size [PR115511]

On AVR and SH with some options sizeof (float) == sizeof (double) and
the 2 types have the same set of values.
http://eel.is/c++draft/conv.rank#2.2 for this says that double still
has bigger rank than float and http://eel.is/c++draft/conv.rank#2.2
says that extended type with the same set of values as more than one
standard floating point type shall have the same rank as double.
I've implemented the latter rule as
  if (cnt > 1 && mv2 == long_double_type_node)
    return -2;
with the _Float64/double/long double case having same mode case (various
targets with -mlong-double-64) in mind.
But never thought there are actually targets where float and double
are the same, that needs handling too, if cnt > 1 (that is the extended
type mv1 has same set of values as 2 or 3 of float/double/long double)
and mv2 is float, we need to return 2, because mv1 in that case should
have same rank as double and double has bigger rank than float.

2024-06-17  Jakub Jelinek  

PR target/111343
PR c++/115511
* typeck.cc (cp_compare_floating_point_conversion_ranks): If an
extended floating point type mv1 has same set of values as more
than one standard floating point type and mv2 is float, return 2.

* g++.dg/cpp23/ext-floating18.C: New test.

[Bug rtl-optimization/115523] [avr] Remove SFmode insns

2024-06-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115523

--- Comment #4 from Andrew Pinski  ---
(In reply to Georg-Johann Lay from comment #3)
> Andrew, in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114243#c2 you wrote
> that the issue is going to be fixed for RISC-V. You have a PR for that, and
> what's the state of it?

I am not 100% sure if this is the latest version of the patch set though, but
it was posted to the gcc-patches@ list in late May:
https://inbox.sourceware.org/gcc-patches/20240512225738.528917-2-juzhe.zh...@rivai.ai/

[Bug rtl-optimization/115523] [avr] Remove SFmode insns

2024-06-17 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115523

--- Comment #3 from Georg-Johann Lay  ---
Andrew, in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114243#c2 you wrote
that the issue is going to be fixed for RISC-V. You have a PR for that, and
what's the state of it?

[Bug c/86869] ICE when taking address of array member of __memx struct pointer

2024-06-17 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86869

Georg-Johann Lay  changed:

   What|Removed |Added

   Target Milestone|14.0|14.1

[Bug pch/115312] [14/15 Regression] ICE when including a PCH via compiler option -include since r14-5836

2024-06-17 Thread lhyatt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115312

Lewis Hyatt changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2024-06-17
            Summary|[14/15 Regression] ICE when |[14/15 Regression] ICE when
                   |including a PCH via         |including a PCH via
                   |compiler option -include    |compiler option -include
                   |                            |since r14-5836

--- Comment #7 from Lewis Hyatt  ---
OK I see now. For the case of a PCH combined with -include it is a regression
in GCC 14 caused by r14-5836. I think any platform that does not make use of
the stdc-predef.h preinclude will be affected (including MinGW). 

The assert added in r14-5836 should not be failing, but it is failing due to an
oversight in r14-2893. This was a new feature for GCC 14 that made #pragma work
during preprocessing, which required the creation of a parser object in
preprocess-only mode.

I will test this patch that should fix it:

=

diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc
index faaf9ee6350..a09a0518c52 100644
--- a/gcc/c-family/c-opts.cc
+++ b/gcc/c-family/c-opts.cc
@@ -1296,8 +1296,8 @@ c_common_init (void)
 
   if (flag_preprocess_only)
     {
-      c_finish_options ();
       c_init_preprocess ();
+      c_finish_options ();
       preprocess_file (parse_in);
       return false;
     }

=

The problem is that c_finish_options() will also include the first
command-line-specified include file. On glibc platforms, this is always the
stdc-predef.h preinclude and then things work. In the absence of a preinclude,
it will include the first file requested with -include. If that file triggers a
PCH load, then we hit the code path introduced in r14-5836; after the PCH is
loaded, we reinitialize the parser. But in this case we have not called
c_init_preprocess() yet, so when we do call it afterward, the parser already
exists and so the assert fails. 

The above patch should make things right, I am testing it now and it should get
into GCC 14.2. You could add it to the list of MinGW-specific patches for
WinLibs and MSYS2 in the meantime too. It's preferable to commenting out the
assert, although that workaround probably does work fine at the moment as well.
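
The ordering problem described above can be sketched with a toy model. This is only an illustration: `ToyFrontEnd` and its members are hypothetical names, not GCC internals, and the real assert is modelled as a `false` return value.

```cpp
#include <cassert>

// Toy model of the initialization order issue; every name here is
// hypothetical and does not mirror the actual GCC data structures.
struct ToyFrontEnd
{
  bool parser_exists = false;

  // Stands in for c_init_preprocess(): it expects to create the parser
  // itself. Returns false where the real code would trip the assert.
  bool init_preprocess()
  {
    if (parser_exists)
      return false;
    parser_exists = true;
    return true;
  }

  // Stands in for c_finish_options(): it processes the first include file;
  // if that file triggers a PCH load, the parser is (re)initialized.
  void finish_options(bool first_include_loads_pch)
  {
    if (first_include_loads_pch)
      parser_exists = true;
  }
};

// Order before the patch: options first, then preprocess init.
bool init_ok_old_order(bool first_include_loads_pch)
{
  ToyFrontEnd fe;
  fe.finish_options(first_include_loads_pch);
  return fe.init_preprocess();
}

// Order after the patch: preprocess init first, then options.
bool init_ok_new_order(bool first_include_loads_pch)
{
  ToyFrontEnd fe;
  bool ok = fe.init_preprocess();
  fe.finish_options(first_include_loads_pch);
  return ok;
}

void check_toy_model()
{
  assert(!init_ok_old_order(true));  // PCH as first include: old order fails
  assert(init_ok_new_order(true));   // swapped order is fine
}
```

In this model the old order only fails when the first include loads a PCH, which matches the observation that platforms with the stdc-predef.h preinclude are not affected.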

[Bug target/115408] regression between gcc 13.3.0 and 14.1.0 using -mips16 and -minterlink-mips16

2024-06-17 Thread broly at mac dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115408

--- Comment #9 from gagan sidhu (broly)  ---
Created attachment 58452
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58452&action=edit
curl failure

This is not good, guys:

[Bug tree-optimization/115489] [12/13/14/15 regression] ICE: tree check: expected class 'type', have 'exceptional' (error_mark) in create_tmp_from_val, at gimplify.cc:589 since r12-3278-g823685221de98

2024-06-17 Thread roger at nextmovesoftware dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115489

Roger Sayle  changed:

   What|Removed |Added

  Component|c   |tree-optimization

--- Comment #3 from Roger Sayle  ---
The call to gimplify_expr on line 651 of gimplify.cc (in internal_get_tmp_var)
doesn't check its return value, which for this test case is GS_ERROR.
It looks like internal_get_tmp_var's callers don't expect that it could ever
fail, so it's unclear whether returning NULL_TREE or error_mark_node would be
safe.
One approach might be to record the original type of val (which gets turned
into error_mark_node by the failing gimplify_expr call), then use this type to
call make_ssa_name if things have gone wrong [an uninitialized SSA name may
cause fewer downstream issues than an error_mark_node].

[Bug libstdc++/115522] [13/14/15 Regression] std::to_array no longer works for struct which is trivial but not default constructible

2024-06-17 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115522

--- Comment #3 from Jonathan Wakely  ---
That shouldn't be needed, because a trivial class has to have a trivial default
constructor.

The problem is that we default-initialize the array, which requires the const
members to be initialized.

We also need to check for assignability, because trivial classes can have
deleted assignment operators (for some reason).


--- a/libstdc++-v3/include/std/array
+++ b/libstdc++-v3/include/std/array
@@ -434,7 +434,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
       static_assert(is_constructible_v<_Tp, _Tp&>);
       if constexpr (is_constructible_v<_Tp, _Tp&>)
        {
-         if constexpr (is_trivial_v<_Tp>)
+         if constexpr (is_trivial_v<_Tp> && is_default_constructible_v<_Tp>
+                       && is_copy_assignable_v<_Tp>)
            {
              array<remove_cv_t<_Tp>, _Nm> __arr;
              if (!__is_constant_evaluated() && _Nm != 0)
@@ -463,7 +464,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
       static_assert(is_move_constructible_v<_Tp>);
       if constexpr (is_move_constructible_v<_Tp>)
        {
-         if constexpr (is_trivial_v<_Tp>)
+         if constexpr (is_trivial_v<_Tp> && is_default_constructible_v<_Tp>
+                       && is_move_assignable_v<_Tp>)
            {
              array<remove_cv_t<_Tp>, _Nm> __arr;
              if (!__is_constant_evaluated() && _Nm != 0)
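
For illustration, the effect of the extra conditions can be sketched outside the library. This is a simplified stand-in, not the libstdc++ code: `sketch_to_array`, `copy_each` and `Range` are made-up names, and the fast path here uses a plain loop rather than memcpy.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>

// Fallback path: copy-construct each element directly, so the element
// type needs neither a default constructor nor an assignment operator.
template <typename T, std::size_t N, std::size_t... I>
constexpr std::array<std::remove_cv_t<T>, N>
copy_each(T (&a)[N], std::index_sequence<I...>)
{
  return {{a[I]...}};
}

// Hypothetical stand-in for std::to_array (lvalue overload only).
template <typename T, std::size_t N>
constexpr std::array<std::remove_cv_t<T>, N>
sketch_to_array(T (&a)[N])
{
  if constexpr (std::is_trivial_v<T> && std::is_default_constructible_v<T>
                && std::is_copy_assignable_v<T>)
    {
      // Fast path: create the array first, then assign into it. This is
      // only valid if T can be default constructed and assigned.
      std::array<std::remove_cv_t<T>, N> arr{};
      for (std::size_t i = 0; i < N; ++i)
        arr[i] = a[i];
      return arr;
    }
  else
    return copy_each(a, std::make_index_sequence<N>{});
}

// Const members: no usable default constructor and no assignment.
struct Range { const int lo; const int hi; };

void check_sketch_to_array()
{
  Range r[] = {{0, 1}, {2, 3}};
  auto a = sketch_to_array(r);  // takes the element-wise fallback
  assert(a[0].lo == 0 && a[1].hi == 3);

  int v[] = {1, 2, 3};
  auto b = sketch_to_array(v);  // takes the fast path
  assert(b[2] == 3);
}
```

With only the `is_trivial_v` test, a type like `Range` could be routed into the fast path (depending on how the compiler answers the trait) and then fail to compile at the default-initialized array or at the assignment.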

[Bug target/115523] [avr] Remove SFmode insns

2024-06-17 Thread avr at gjlay dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115523

--- Comment #2 from Georg-Johann Lay  ---
On 17.06.24 at 17:06, pinskia at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115523
> 
> Andrew Pinski  changed:
> 
> What|Removed |Added
> 
>   CC||pinskia at gcc dot gnu.org
>Component|rtl-optimization|target
> 

This is an issue in the RTL passes, subreg-lowering vs. reg alloc.

It's just the case that the backend can (partially) hack around it.

Johann

[Bug libstdc++/37475] codecvt::do_in/do_out functions return "ok" when the output sequence has zero length

2024-06-17 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=37475

--- Comment #15 from Jonathan Wakely  ---
Oops, that's definitely not intended! Good catch, thanks.

[Bug target/114189] Target implements obsolete vcond{,u,eq} expanders

2024-06-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114189

Andrew Pinski  changed:

   What|Removed |Added

   Last reconfirmed||2024-06-17
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW

--- Comment #5 from Andrew Pinski  ---
.

[Bug target/115518] aarch64: Poor codegen for arm_neon_sve_bridge.h

2024-06-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115518

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2024-06-17
   Severity|normal  |enhancement
 Ever confirmed|0   |1
   Keywords||missed-optimization
 CC||pinskia at gcc dot gnu.org

--- Comment #1 from Andrew Pinski  ---
Confirmed.

[Bug rtl-optimization/115521] ICE at -O1 with "-fno-tree-ccp -fno-tree-dominator-opts" on x86_64-linux-gnu: in extract_constrain_insn, at recog.cc:2713

2024-06-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115521

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|DUPLICATE   |---
 Status|RESOLVED|NEW

--- Comment #5 from Andrew Pinski  ---
(In reply to Zhendong Su from comment #4)
> (In reply to Andrew Pinski from comment #3)
> > Dup then.
> > 
> > *** This bug has been marked as a duplicate of bug 114942 ***
> 
> Similar, but perhaps not a dup as PR114942 doesn't reproduce on the trunk
> anymore.

Oh, there was a change on the trunk which partly solved the issue, but it looks
like it did not fix all of the issue with the RA.

[Bug tree-optimization/115520] Loop vectorization fails when bool variable instead of unsigned char

2024-06-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115520

Andrew Pinski  changed:

   What|Removed |Added

   Last reconfirmed||2024-06-17
   Severity|normal  |enhancement
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1

--- Comment #2 from Andrew Pinski  ---
Note the reason why the struct case works is due to SRA "changing" bool type to
unsigned char.

[Bug tree-optimization/115520] Loop vectorization fails when bool variable instead of unsigned char

2024-06-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115520

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|Loop vectorization fails    |Loop vectorization fails
                   |when not using a struct     |when bool variable instead
                   |sometimes                   |of unsigned char
                 CC|                            |pinskia at gcc dot gnu.org

--- Comment #1 from Andrew Pinski  ---
I have seen this one recorded too.

[Bug rtl-optimization/115521] ICE at -O1 with "-fno-tree-ccp -fno-tree-dominator-opts" on x86_64-linux-gnu: in extract_constrain_insn, at recog.cc:2713

2024-06-17 Thread zhendong.su at inf dot ethz.ch via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115521

--- Comment #4 from Zhendong Su  ---
(In reply to Andrew Pinski from comment #3)
> Dup then.
> 
> *** This bug has been marked as a duplicate of bug 114942 ***

Similar, but perhaps not a dup as PR114942 doesn't reproduce on the trunk
anymore.

[Bug target/114942] [14/15 Regression] ICE on valid code at -O1 with "-fno-tree-sra -fno-guess-branch-probability": in extract_constrain_insn, at recog.cc:2713

2024-06-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114942

--- Comment #7 from Andrew Pinski  ---
*** Bug 115521 has been marked as a duplicate of this bug. ***

[Bug rtl-optimization/115521] ICE at -O1 with "-fno-tree-ccp -fno-tree-dominator-opts" on x86_64-linux-gnu: in extract_constrain_insn, at recog.cc:2713

2024-06-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115521

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
 Status|NEW |RESOLVED

--- Comment #3 from Andrew Pinski  ---
Dup then.

*** This bug has been marked as a duplicate of bug 114942 ***

[Bug libstdc++/115522] [13/14/15 Regression] std::to_array no longer works for struct which is trivial but not default constructible

2024-06-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115522

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|                            |https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110167

--- Comment #2 from Andrew Pinski  ---
Looks like the check constexpr (is_trivial_v<_Tp>), which was added in
r13-8421-g4c6bb36e88d5c8 / r14-1647-g960de5dd886572711ef86fa1e15e30d3810eccb9,
should be extended to something like:
constexpr (is_trivial_v<_Tp> && is_default_constructible_v<_Tp>)


Or maybe more, I am not 100% sure.

[Bug libstdc++/115522] [13/14/15 Regression] std::to_array no longer works for struct which is trivial but not default constructible

2024-06-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115522

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |pinskia at gcc dot gnu.org
            Summary|std::to_array no longer     |[13/14/15 Regression]
                   |works for struct which is   |std::to_array no longer
                   |trivial but not default     |works for struct which is
                   |constructible               |trivial but not default
                   |                            |constructible
           Keywords|                            |rejects-valid
             Status|UNCONFIRMED                 |NEW
      Known to work|                            |12.3.0, 13.2.0
     Ever confirmed|0                           |1
      Known to fail|                            |13.3.0, 14.1.0
   Target Milestone|---                         |13.4
   Last reconfirmed|                            |2024-06-17

--- Comment #1 from Andrew Pinski  ---
Confirmed.

[Bug fortran/107294] Missed optimization: multiplying real with complex number in Fortran (only)

2024-06-17 Thread kargls at comcast dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107294

--- Comment #6 from kargls at comcast dot net ---
(In reply to mjr19 from comment #5)
> But 10.1.5.2.4 says "once the interpretation of a numeric intrinsic
> operation is established, the processor may evaluate any mathematically
> equivalent expression, provided that the integrity of parentheses is not
> violated."
> 
> As cmplx(r)*z and cmplx(r*real(z),r*aimag(z)) are mathematically equivalent,
> is a Fortran compiler not permitted to make this optimisation (unless
> conforming to F90 or F95, for 10.1.5.2.4 was first introduced in F2003)?
> 
> Furthermore, Fortran does not define how complex multiplication is to be
> performed, so relying on the precise details result when multiplying signed
> zeros or NaNs may be unwise.

See comment #3.  Type conversion occurs.

If you want to get a possibly wrong answer fast, use either -Ofast or
-ffast-math.
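
To make the type-conversion point concrete, here is a small sketch in C++ rather than Fortran. It assumes the real operand is first converted to the complex value (r, 0) and that the textbook product formula is then applied; the Fortran standard does not prescribe that formula, and `Cplx`, `full_mul` and `scale` are names made up for this example.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

struct Cplx { double re, im; };

// Textbook complex product, as it applies once the real operand has been
// converted to the complex value (r, 0).
Cplx full_mul(Cplx a, Cplx b)
{
  return {a.re * b.re - a.im * b.im, a.re * b.im + a.im * b.re};
}

// The proposed shortcut: scale each component by the real factor.
Cplx scale(double r, Cplx z)
{
  return {r * z.re, r * z.im};
}

void check_special_values()
{
  const double inf = std::numeric_limits<double>::infinity();

  // Infinite real part: the 0*inf term turns the imaginary part into NaN.
  Cplx z1{inf, 1.0};
  Cplx f1 = full_mul({1.0, 0.0}, z1);
  Cplx s1 = scale(1.0, z1);
  assert(std::isnan(f1.im));
  assert(s1.im == 1.0);

  // Negative zeros: (-0) - (-0) yields +0, while 1 * (-0) stays -0.
  Cplx z2{-0.0, -0.0};
  Cplx f2 = full_mul({1.0, 0.0}, z2);
  Cplx s2 = scale(1.0, z2);
  assert(!std::signbit(f2.re));
  assert(std::signbit(s2.re));
}
```

The two forms agree for ordinary finite values, so the difference only shows up for infinities, NaNs and signed zeros, which is where options such as -ffast-math give the compiler permission to ignore it.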

[Bug c++/99678] [concepts] requires-clause allows undeclared identifier

2024-06-17 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99678

Patrick Palka  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #6 from Patrick Palka  ---
Fixed for GCC 14.2, thanks for the bug report.

[Bug c++/115239] [14 Regression] ICE: Segmentation fault with ambiguous function call in some cases (`const char*` vs `char` with `long` vs `unsigned`) since r14-6522

2024-06-17 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115239

Patrick Palka  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #7 from Patrick Palka  ---
Fixed for GCC 14.2, thanks for the bug report.

[Bug c++/115283] [14 Regression] "used but never defined" with extern templates

2024-06-17 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115283

Patrick Palka  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #6 from Patrick Palka  ---
Fixed for GCC 14.2, thanks for the bug report.

[Bug c++/99678] [concepts] requires-clause allows undeclared identifier

2024-06-17 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99678

--- Comment #5 from GCC Commits  ---
The releases/gcc-14 branch has been updated by Patrick Palka:

https://gcc.gnu.org/g:20cda2e85c307096a3856f7f27215b8a28982fb6

commit r14-10320-g20cda2e85c307096a3856f7f27215b8a28982fb6
Author: Patrick Palka 
Date:   Thu Jun 13 10:16:10 2024 -0400

c++: undeclared identifier in requires-clause [PR99678]

Since the terms of a requires-clause are grammatically primary-expressions
and not e.g. postfix-expressions, it seems we need to explicitly handle
and diagnose the case where a term parses to a bare unresolved identifier,
like cp_parser_postfix_expression does, since cp_parser_primary_expression
leaves that up to its callers.  Otherwise we incorrectly accept the first
three requires-clauses below.

Note that the only occurrences of primary-expression in the grammar are
postfix-expression and constraint-logical-and-expression, so it's not too
surprising that we need this special handling here.

PR c++/99678

gcc/cp/ChangeLog:

* parser.cc (cp_parser_constraint_primary_expression): Diagnose
a bare unresolved unqualified-id.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-requires38.C: New test.

Reviewed-by: Jason Merrill 
(cherry picked from commit d387ecb2b2f44f33fd6a7c5ec7eadaf6dd70efc9)

[Bug c++/115239] [14 Regression] ICE: Segmentation fault with ambiguous function call in some cases (`const char*` vs `char` with `long` vs `unsigned`) since r14-6522

2024-06-17 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115239

--- Comment #6 from GCC Commits  ---
The releases/gcc-14 branch has been updated by Patrick Palka:

https://gcc.gnu.org/g:4df86402990e2f45e02a367f1734a22ebc041e98

commit r14-10319-g4df86402990e2f45e02a367f1734a22ebc041e98
Author: Patrick Palka 
Date:   Thu Jun 13 10:02:43 2024 -0400

c++: ICE w/ ambig and non-strictly-viable cands [PR115239]

Here during overload resolution we have two strictly viable ambiguous
candidates #1 and #2, and two non-strictly viable candidates #3 and #4
which we hold on to ever since r14-6522.  These latter candidates have
an empty second arg conversion since the first arg conversion was deemed
bad, and this trips up joust when called on #3 and #4 which assumes all
arg conversions are there.

We can fix this by making joust robust to empty arg conversions, but in
this situation we shouldn't need to compare #3 and #4 at all given that
we have a strictly viable candidate.  To that end, this patch makes
tourney shortcut considering non-strictly viable candidates upon
encountering ambiguity between two strictly viable candidates (taking
advantage of the fact that the candidates list is sorted according to
viability via splice_viable).

PR c++/115239

gcc/cp/ChangeLog:

* call.cc (tourney): Don't consider a non-strictly viable
candidate as the champ if there was ambiguity between two
strictly viable candidates.

gcc/testsuite/ChangeLog:

* g++.dg/overload/error7.C: New test.

Reviewed-by: Jason Merrill 
(cherry picked from commit 7fed7e9bbc57d502e141e079a6be2706bdbd4560)

[Bug c++/115283] [14 Regression] "used but never defined" with extern templates

2024-06-17 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115283

--- Comment #5 from GCC Commits  ---
The releases/gcc-14 branch has been updated by Patrick Palka:

https://gcc.gnu.org/g:9583f781e17d4da881ee64db43af939402331413

commit r14-10318-g9583f781e17d4da881ee64db43af939402331413
Author: Patrick Palka 
Date:   Wed Jun 12 20:05:05 2024 -0400

c++: visibility wrt concept-id as targ [PR115283]

Like with alias templates, it seems we don't maintain visibility flags
for concepts either, so min_vis_expr_r should ignore them for now.
Otherwise after r14-6789 we may incorrectly give a function template that
uses a concept-id in its signature internal linkage.

PR c++/115283

gcc/cp/ChangeLog:

* decl2.cc (min_vis_expr_r) : Ignore
concepts.

gcc/testsuite/ChangeLog:

* g++.dg/template/linkage5.C: New test.

Reviewed-by: Jason Merrill 
(cherry picked from commit b1fe718cbe0c8883af89f52e0aad3ebf913683de)

[Bug other/115525] New: Documentation: "sentinel" attribute should suggest "nullptr" instead of NULL

2024-06-17 Thread Explorer09 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115525

Bug ID: 115525
   Summary: Documentation: "sentinel" attribute should suggest
"nullptr" instead of NULL
   Product: gcc
   Version: 14.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: other
  Assignee: unassigned at gcc dot gnu.org
  Reporter: Explorer09 at gmail dot com
  Target Milestone: ---

"sentinel" function attribute in GCC documentation:
https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-sentinel-function-attribute

The section mentions `NULL` as the null pointer, but in C and C++ this is a
bad idea (see https://ewontfix.com/11/ for why).

"nullptr" is now standardized in both C and C++, so the documentation should
suggest it instead wherever possible. (For pre-C23 code, `(void *)0` may be
used as an alternative to `nullptr`, but `NULL` (the constant) should now be
deprecated for use as a function sentinel.)

I suggest changing the paragraph to the following:

```
This function attribute indicates that a call to the function is expected to
have a null pointer as the sentinel. The attribute is only valid on variadic
functions. If the optional position argument is specified to the
attribute, the function expects the sentinel to be present at position
counting backwards from the end of the argument list. If position is
unspecified or 0, the sentinel is expected to be the last argument of the
function call, that is,

  __attribute__ ((sentinel))
  is equivalent to
  __attribute__ ((sentinel(0)))

The attribute is automatically set with a position of 0 for the built-in
functions execl and execlp. The built-in function
execle has the attribute set with a position of 1.

Starting with C++11 and C23, nullptr is the preferred keyword for
passing a null pointer argument. For code that needs to support older standards,
a zero value with a pointer cast may be used as an alternative.

The warnings for missing or incorrect sentinels are enabled with
-Wformat.

```
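
A minimal usage sketch of the attribute with `nullptr` as the sentinel, written as C++ for GCC-compatible compilers. `count_strings` is a hypothetical helper invented for this example; it is not part of the documentation.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>

// Hypothetical variadic helper: counts C strings up to the terminating
// null pointer. The attribute asks the compiler to diagnose calls whose
// last argument is not a null pointer.
__attribute__((sentinel))
std::size_t count_strings(const char *first, ...)
{
  std::size_t n = 0;
  va_list ap;
  va_start(ap, first);
  for (const char *s = first; s != nullptr; s = va_arg(ap, const char *))
    ++n;
  va_end(ap);
  return n;
}

void check_count_strings()
{
  // nullptr passed through the ellipsis is converted to a null void*.
  assert(count_strings("a", "b", "c", nullptr) == 3);
}
```

Passing a bare `0` or a `NULL` that expands to an integer constant in the last position is the case the linked article warns about, since it may not be passed with pointer width.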

[Bug c++/115524] New: Cygwin: Space character categorized as non-printable by std::ctype

2024-06-17 Thread kristian.spangsege at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115524

Bug ID: 115524
   Summary: Cygwin: Space character categorized as non-printable
by std::ctype
   Product: gcc
   Version: 13.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: kristian.spangsege at gmail dot com
  Target Milestone: ---

Created attachment 58451
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58451&action=edit
test.cpp Demonstrate the bug

On Cygwin, a space character is reported as non-printable by
`std::ctype<wchar_t>` in the "C" locale (the only locale supported by
libstdc++ on Cygwin). In other words, the following expression evaluates to
`false`:

ctype.is(std::ctype_base::print, ctype.widen(' '))

where `ctype` is obtained by `std::use_facet<std::ctype<wchar_t>>(loc)` and
`loc` is the "C" locale.

It should evaluate to `true` because a space character is required
by the C++ standard to be categorized as printable (the defining distinction
between a printable and a graphical character is that space is printable but
not graphical).

Also, the space character is reported as printable by `std::iswprint(int ch)`.
So it seems like the problem is specific to `libstdc++`.

Also, the non-wide space character is reported as printable, i.e.,
`std::use_facet<std::ctype<char>>(loc).is(std::ctype_base::print, ' ')` is
`true`.

Also, `ctype.is(std::ctype_base::print, ctype.widen(' '))` is `true` with
MinGW, with GCC on Linux, and with Visual Studio.

The problem can be demonstrated with the code below (attached as `test.cpp`) in
regular Cygwin (https://cygwin.com/install.html) using GCC 12.3.1 and in the
Cygwin environment of MSYS2 (https://www.msys2.org/) using GCC 13.2.0:

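#include <cwctype>
#include <iostream>
#include <locale>
#include <string>

int main()
{
    using facet_type = std::ctype<wchar_t>;
    std::locale loc;
    const auto& ctype = std::use_facet<facet_type>(loc);
    wchar_t ch = ctype.widen(' ');
    // libstdc++ facet classification of the wide space character.
    std::cout << ctype.is(ctype.print, ch) << "\n";
    // C library classification of the same character, for comparison.
    std::cout << (std::iswprint(std::char_traits<wchar_t>::to_int_type(ch)) != 0) << "\n";
}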

[Bug target/115493] [15 regression] gcc.c-torture/execute/pr94734.c fails on MVE after r15-1054-g202a9c8fe7d

2024-06-17 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115493

--- Comment #7 from Richard Biener  ---
Created attachment 58450
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58450&action=edit
patch I am testing

I'm testing this (visually confirmed it fixes the observed issue).

[Bug target/115493] [15 regression] gcc.c-torture/execute/pr94734.c fails on MVE after r15-1054-g202a9c8fe7d

2024-06-17 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115493

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |15.0
           Priority|P3                          |P1

[Bug target/115500] RISC-V: Performance regression on 1bit test

2024-06-17 Thread law at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115500

--- Comment #7 from Jeffrey A. Law  ---
And to be clearer, if you look at the two assembly snippets:

The problem is about
   0:   814d                    srli    a0,a0,0x13
   2:   8905                    andi    a0,a0,1
   4:   e501                    bnez    a0,c <.L3>
vs
   0:   02c51793                slli    a5,a0,0x2c
   4:   0007c563                bltz    a5,e <.L3>

They're both using the same basic idioms (logical shifts and simple conditional
branch), one just has an extra andi.   The second one has a smaller data
dependency critical path.  So it's hard to see how the first would ever be
better.

More likely than not what's going on here is going to be something highly
specific to the micro-architecture implementation of whatever chip you tested. 
So for example, some uarchs are particularly sensitive to code alignments. 
That could affect the little loop or the function call.

To put this in perspective, I'm aware of a uarch that would show a double-digit
performance delta due to a 2 instruction, 6 byte sequence moving across a
particular boundary -- in a real world benchmark that executes nearly a
trillion instructions.

Point is you have to be *very* careful analyzing this stuff and sometimes
things can be very surprising.

So probably the next question is: what did you use to test this, what do we
know about its uarch, and can we correlate what is public about that uarch with
the behavior you're seeing?

[Bug target/115493] [15 regression] gcc.c-torture/execute/pr94734.c fails on MVE after r15-1054-g202a9c8fe7d

2024-06-17 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115493

--- Comment #6 from Richard Biener  ---
The crucial difference is:

-  _60 = BIT_FIELD_REF ;
-  _61 = BIT_FIELD_REF ;
-  _62 = MAX_EXPR <_61, _60>;
-  _63 = BIT_FIELD_REF ;
-  _64 = MAX_EXPR <_63, _62>;
-  _65 = BIT_FIELD_REF ;
-  _66 = MAX_EXPR <_65, _64>;
-  _67 = _66 == -1;
-  stmp_cstore_15.38_68 = _67 ? arr__I_lsm.26_4 : _66;
+  _58 = BIT_FIELD_REF ;
+  _59 = BIT_FIELD_REF ;
+  _60 = BIT_FIELD_REF ;
+  _61 = BIT_FIELD_REF ;
+  _62 = MAX_EXPR <_58, _59>;
+  _63 = MAX_EXPR <_62, _60>;
+  _64 = MAX_EXPR <_63, _61>;
+  _65 = _61 == -1;
+  stmp_cstore_15.38_66 = _65 ? arr__I_lsm.26_4 : _61;

Note how the compare against -1 uses the vector component rather than the
MAX-reduced value.

[Bug c/115290] [12/13/14/15 Regression] tree check fail in c_tree_printer, at c/c-objc-common.cc:330

2024-06-17 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115290

Jakub Jelinek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org
             Status|NEW                         |ASSIGNED

--- Comment #2 from Jakub Jelinek  ---
Created attachment 58449
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58449&action=edit
gcc15-pr115290.patch

Untested fix.
