[Bug tree-optimization/115629] Inefficient if-convert of masked conditionals

2024-06-26 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115629

--- Comment #2 from GCC Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:629257bcb81434117f1e9c68479032563176dc0c

commit r15-1662-g629257bcb81434117f1e9c68479032563176dc0c
Author: Richard Biener 
Date:   Tue Jun 25 14:04:31 2024 +0200

tree-optimization/115629 - missed tail merging

The following fixes a missed tail-merging observed for the testcase
in PR115629.  The issue is that deps_ok_for_redirect rejects the
merge when it does not find that both blocks would be valid
prevailing blocks.  The following instead makes sure to record the
working block as prevailing.  Also, stmt comparison fails for
indirect references and does not handle memory references
thoroughly, failing to unify array indices and indirected pointers.
The following attempts to fix this.
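
As a rough illustration of the kind of code tail merging targets (this is a hypothetical example, not the testcase from the PR; the function name pick and its parameters are assumptions made for this sketch), two arms of a conditional can end in the same memory load, so their tails are candidates for being unified into one block:

```cpp
#include <cassert>

// Hypothetical example: the first two arms both end in the load b[i].
// A tail-merging pass can only unify them if it recognises the two
// memory references as equal.
int pick(const int *a, const int *b, const int *c, int i)
{
    int r;
    if (a[i])
        r = b[i];   // tail: load b[i]
    else if (c[i])
        r = b[i];   // same tail, a candidate for merging
    else
        r = 0;
    return r;
}

// Small self-check of the source-level behaviour.
void pick_demo()
{
    const int a[2] = {1, 0};
    const int b[2] = {7, 9};
    const int c[2] = {0, 5};
    assert(pick(a, b, c, 0) == 7);  // first arm taken
    assert(pick(a, b, c, 1) == 9);  // second arm taken, same load
}
```

Whether a given compiler version actually merges these blocks depends on the pass pipeline and options; the sketch only shows the source-level shape.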

PR tree-optimization/115629
* tree-ssa-tail-merge.cc (gimple_equal_p): Handle
memory references better.
(deps_ok_for_redirect): Handle the case where not both blocks
are considered valid prevailing blocks.

* gcc.dg/tree-ssa/tail-merge-1.c: New testcase.

[Bug tree-optimization/115652] [15 Regression] GCN: FAIL: gcc.dg/vect/pr70138-{1,2}.c (internal compiler error: verify_ssa failed)

2024-06-26 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115652

--- Comment #2 from GCC Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:f80db5495d5f8455b3003951727eb6c8dc67d81d

commit r15-1653-gf80db5495d5f8455b3003951727eb6c8dc67d81d
Author: Richard Biener 
Date:   Wed Jun 26 09:25:27 2024 +0200

tree-optimization/115652 - adjust insertion gsi for SLP

The following adjusts how SLP computes the insertion location.  In
particular it advanced the insert iterator of the found last_stmt.
The vectorizer will later insert stmts _before_ it.  But we also
have the constraint that possibly masked ops may not be scheduled
outside of the loop and as we do not model the loop mask in the
SLP graph we have to adjust for that.  The following moves this
to after the advance since it isn't compatible with that as the
current GIMPLE_COND exception shows.  The PR is about in-order
reduction vectorization which also isn't happy when that's the
very first stmt.

PR tree-optimization/115652
* tree-vect-slp.cc (vect_schedule_slp_node): Advance the
iterator based on last_stmt only for vector defs.

[Bug target/106069] [12/13/14/15 Regression] wrong code with -O -fno-tree-forwprop -maltivec on ppc64le

2024-06-26 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069

--- Comment #41 from GCC Commits  ---
The master branch has been updated by Kewen Lin :

https://gcc.gnu.org/g:62520e4e9f7e2fe8a16ee57a4bd35da2e921ae22

commit r15-1644-g62520e4e9f7e2fe8a16ee57a4bd35da2e921ae22
Author: Kewen Lin 
Date:   Wed Jun 26 02:16:17 2024 -0500

rs6000: Fix wrong RTL patterns for vector merge high/low char on LE

Commit r12-4496 changed some define_expands and define_insns
for vector merge high/low char, namely altivec_vmrg[hl]b.
These are mainly for the built-in functions vec_merge{h,l}
and for some internal gen function needs.  These functions
should take endianness into account.  Taking vec_mergeh as an
example, PVIPR defines it as "Merges the first halves (in
element order) of two vectors"; note that it says element
order.  So it is mapped to vmrghb on BE and to vmrglb on LE.
Although the mapped insns differ, as discussed in PR106069
the RTL pattern should still be the same.  That was the case
before commit r12-4496, but since that commit the patterns
differ between BE and LE.  Similar to the 32-bit element case
in the commit log of r15-1504, this 8-bit element pattern on
LE does not actually match what the underlying insn is
intended to represent, so once an optimization such as combine
makes changes based on it, the result can be wrong code.  The
newly constructed test case pr106069-1.c is a typical example
of this issue.

So this patch fixes the wrong RTL pattern and ensures that the
associated RTL patterns are the same as before, with the same
semantics as their mapped insns.  With this patch, expanders
like altivec_vmrghb expand into altivec_vmrghb_direct_be or
altivec_vmrglb_direct_le depending on endianness.  "direct"
shows which insn will be generated, while _be and _le
distinguish the RTL patterns by endianness.
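
To make the element-order semantics quoted above concrete, here is a scalar model of vec_mergeh for sixteen 8-bit elements.  It is only a sketch: the names V16QI (borrowed as a type alias) and model_mergeh are assumptions made for this illustration, and the code models the documented result of the built-in, not the RTL patterns or the hardware insns.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model of a 16 x 8-bit vector.
using V16QI = std::array<std::uint8_t, 16>;

// Interleave the first halves (in element order) of a and b.
V16QI model_mergeh(const V16QI &a, const V16QI &b)
{
    V16QI r{};
    for (std::size_t i = 0; i < 8; ++i) {
        r[2 * i]     = a[i];  // element i of the first vector
        r[2 * i + 1] = b[i];  // followed by element i of the second
    }
    return r;
}

// Small self-check: a = 0..15, b = 100..115.
void mergeh_demo()
{
    V16QI a{}, b{};
    for (std::size_t i = 0; i < 16; ++i) {
        a[i] = static_cast<std::uint8_t>(i);
        b[i] = static_cast<std::uint8_t>(100 + i);
    }
    const V16QI r = model_mergeh(a, b);
    assert(r[0] == 0 && r[1] == 100);
    assert(r[14] == 7 && r[15] == 107);
}
```

The model is endianness-independent, which matches the point made in the message: the semantics stay the same and only the insn chosen to implement them differs between BE and LE.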

Co-authored-by: Xionghu Luo 

PR target/106069
PR target/115355

gcc/ChangeLog:

* config/rs6000/altivec.md (altivec_vmrghb_direct): Rename to ...
(altivec_vmrghb_direct_be): ... this.  Add condition
BYTES_BIG_ENDIAN.
(altivec_vmrghb_direct_le): New define_insn.
(altivec_vmrglb_direct): Rename to ...
(altivec_vmrglb_direct_be): ... this.  Add condition
BYTES_BIG_ENDIAN.
(altivec_vmrglb_direct_le): New define_insn.
(altivec_vmrghb): Adjust by calling gen_altivec_vmrghb_direct_be
for BE and gen_altivec_vmrglb_direct_le for LE.
(altivec_vmrglb): Adjust by calling gen_altivec_vmrglb_direct_be
for BE and gen_altivec_vmrghb_direct_le for LE.
* config/rs6000/rs6000.cc (altivec_expand_vec_perm_const): Replace
CODE_FOR_altivec_vmrghb_direct by
CODE_FOR_altivec_vmrghb_direct_be for BE and
CODE_FOR_altivec_vmrghb_direct_le for LE.  And replace
CODE_FOR_altivec_vmrglb_direct by
CODE_FOR_altivec_vmrglb_direct_be for BE and
CODE_FOR_altivec_vmrglb_direct_le for LE.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr106069-1.c: New test.

[Bug target/115355] [12/13/14/15 Regression] vectorization exposes wrong code on P9 LE starting from r12-4496

2024-06-26 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115355

--- Comment #14 from GCC Commits  ---
The master branch has been updated by Kewen Lin :

https://gcc.gnu.org/g:62520e4e9f7e2fe8a16ee57a4bd35da2e921ae22

commit r15-1644-g62520e4e9f7e2fe8a16ee57a4bd35da2e921ae22
Author: Kewen Lin 
Date:   Wed Jun 26 02:16:17 2024 -0500

rs6000: Fix wrong RTL patterns for vector merge high/low char on LE

[Bug target/115355] [12/13/14/15 Regression] vectorization exposes wrong code on P9 LE starting from r12-4496

2024-06-26 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115355

--- Comment #15 from GCC Commits  ---
The master branch has been updated by Kewen Lin :

https://gcc.gnu.org/g:812c70bf4981958488331d4ea5af8709b5321da1

commit r15-1645-g812c70bf4981958488331d4ea5af8709b5321da1
Author: Kewen Lin 
Date:   Wed Jun 26 02:16:17 2024 -0500

rs6000: Fix wrong RTL patterns for vector merge high/low short on LE

Commit r12-4496 changed some define_expands and define_insns
for vector merge high/low short, namely altivec_vmrg[hl]h.
These are mainly for the built-in functions vec_merge{h,l}
and for some internal gen function needs.  These functions
should take endianness into account.  Taking vec_mergeh as an
example, PVIPR defines it as "Merges the first halves (in
element order) of two vectors"; note that it says element
order.  So it is mapped to vmrghh on BE and to vmrglh on LE.
Although the mapped insns differ, as discussed in PR106069
the RTL pattern should still be the same.  That was the case
before commit r12-4496, but since that commit the patterns
differ between BE and LE.  Similar to the 32-bit element case
in the commit log of r15-1504, this 16-bit element pattern on
LE does not actually match what the underlying insn is
intended to represent, so once an optimization such as combine
makes changes based on it, the result can be wrong code.  The
newly constructed test case pr106069-2.c is a typical example
of this issue for element type short.

So this patch fixes the wrong RTL pattern and ensures that the
associated RTL patterns are the same as before, with the same
semantics as their mapped insns.  With this patch, expanders
like altivec_vmrghh expand into altivec_vmrghh_direct_be or
altivec_vmrglh_direct_le depending on endianness.  "direct"
shows which insn will be generated, while _be and _le
distinguish the RTL patterns by endianness.

Co-authored-by: Xionghu Luo 

PR target/106069
PR target/115355

gcc/ChangeLog:

* config/rs6000/altivec.md (altivec_vmrghh_direct): Rename to ...
(altivec_vmrghh_direct_be): ... this.  Add condition
BYTES_BIG_ENDIAN.
(altivec_vmrghh_direct_le): New define_insn.
(altivec_vmrglh_direct): Rename to ...
(altivec_vmrglh_direct_be): ... this.  Add condition
BYTES_BIG_ENDIAN.
(altivec_vmrglh_direct_le): New define_insn.
(altivec_vmrghh): Adjust by calling gen_altivec_vmrghh_direct_be
for BE and gen_altivec_vmrglh_direct_le for LE.
(altivec_vmrglh): Adjust by calling gen_altivec_vmrglh_direct_be
for BE and gen_altivec_vmrghh_direct_le for LE.
(vec_widen_umult_hi_v16qi): Adjust the call to
gen_altivec_vmrghh_direct by gen_altivec_vmrghh for BE
and by gen_altivec_vmrglh for LE.
(vec_widen_smult_hi_v16qi): Likewise.
(vec_widen_umult_lo_v16qi): Adjust the call to
gen_altivec_vmrglh_direct by gen_altivec_vmrglh for BE
and by gen_altivec_vmrghh for LE.
(vec_widen_smult_lo_v16qi): Likewise.
* config/rs6000/rs6000.cc (altivec_expand_vec_perm_const): Replace
CODE_FOR_altivec_vmrghh_direct by
CODE_FOR_altivec_vmrghh_direct_be for BE and
CODE_FOR_altivec_vmrghh_direct_le for LE.  And replace
CODE_FOR_altivec_vmrglh_direct by
CODE_FOR_altivec_vmrglh_direct_be for BE and
CODE_FOR_altivec_vmrglh_direct_le for LE.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr106069-2.c: New test.

[Bug target/106069] [12/13/14/15 Regression] wrong code with -O -fno-tree-forwprop -maltivec on ppc64le

2024-06-26 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069

--- Comment #42 from GCC Commits  ---
The master branch has been updated by Kewen Lin :

https://gcc.gnu.org/g:812c70bf4981958488331d4ea5af8709b5321da1

commit r15-1645-g812c70bf4981958488331d4ea5af8709b5321da1
Author: Kewen Lin 
Date:   Wed Jun 26 02:16:17 2024 -0500

rs6000: Fix wrong RTL patterns for vector merge high/low short on LE

[Bug c/115646] [gcc][trunk] ICE in gen_conditions_for_pow_int_base, at tree-call-cdce.cc:587

2024-06-26 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115646

--- Comment #2 from GCC Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:453b1d291d1a0f89087ad91cf6b1bed1ec68eff3

commit r15-1643-g453b1d291d1a0f89087ad91cf6b1bed1ec68eff3
Author: Richard Biener 
Date:   Tue Jun 25 16:13:02 2024 +0200

tree-optimization/115646 - ICE with pow shrink-wrapping from bitfield

The following makes analysis and transform agree on constraints.

PR tree-optimization/115646
* tree-call-cdce.cc (check_pow): Check for bit_sz values
as allowed by transform.

* gcc.dg/pr115646.c: New testcase.

[Bug tree-optimization/113281] [11 Regression] Latent wrong code due to vectorization of shift reduction and missing promotions since r9-1590

2024-06-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

--- Comment #34 from GCC Commits  ---
The master branch has been updated by Alexandre Oliva :

https://gcc.gnu.org/g:54d2339c9f87f702e02e571a5460e11c19e1c02f

commit r15-1639-g54d2339c9f87f702e02e571a5460e11c19e1c02f
Author: Alexandre Oliva 
Date:   Wed Jun 26 02:08:18 2024 -0300

[testsuite] [arm] [vect] adjust mve-vshr test [PR113281]

The test was too optimistic, alas.  We used to vectorize shifts by
clamping the shift counts below the bit width of the types (e.g. at 15
for 16-bit vector elements), but (uint16_t)32768 >> (uint16_t)16 is
well defined (because of promotion to 32-bit int) and must yield 0,
not 1 (as before the fix).
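
The arithmetic above can be checked with a small scalar sketch.  It assumes a 32-bit int; the function names are made up for this illustration, and shift_clamped is only a model of the clamping behaviour described above, not the vectorizer's code.

```cpp
#include <cassert>
#include <cstdint>

// Scalar semantics: both operands are promoted to int before the
// shift, so a count of 16 is in range (assuming int is 32 bits).
std::uint16_t shift_promoted(std::uint16_t x, std::uint16_t n)
{
    return static_cast<std::uint16_t>(x >> n);
}

// Hypothetical model of clamping the count below the element width.
std::uint16_t shift_clamped(std::uint16_t x, std::uint16_t n)
{
    const std::uint16_t c = n > 15 ? 15 : n;
    return static_cast<std::uint16_t>(x >> c);
}

void shift_demo()
{
    assert(shift_promoted(32768, 16) == 0);  // well defined, yields 0
    assert(shift_clamped(32768, 16) == 1);   // clamped to 15, yields 1
}
```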

Unfortunately, in the gimple model of vector units, such large shift
counts wouldn't be well-defined, so we won't vectorize such shifts any
more, unless we can tell they're in range or undefined.

So the test that expected the vectorization we no longer perform
needs to be adjusted.  Instead of nobbling the test, Richard Earnshaw
suggested annotating the test with the expected ranges so as to enable
the optimization, and Christophe Lyon suggested a further
simplification.


Co-Authored-By: Richard Earnshaw 

for  gcc/testsuite/ChangeLog

PR tree-optimization/113281
* gcc.target/arm/simd/mve-vshr.c: Add expected ranges.

[Bug target/114189] Target implements obsolete vcond{,u,eq} expanders

2024-06-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114189

--- Comment #6 from GCC Commits  ---
The master branch has been updated by hongtao Liu :

https://gcc.gnu.org/g:aac00d09859cc5934bd0f7493d537b8430337773

commit r15-1638-gaac00d09859cc5934bd0f7493d537b8430337773
Author: liuhongt 
Date:   Thu Jun 20 12:41:13 2024 +0800

Optimize a < 0 ? -1 : 0 to (signed)a >> 31.

Try to optimize x < 0 ? -1 : 0 into (signed) x >> 31
and x < 0 ? 1 : 0 into (unsigned) x >> 31.

Move the optimization done in ix86_expand_int_vcond to match.pd.
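
The equivalence behind this simplification can be sketched per element in scalar code.  This is an illustration only: the commit applies the rule to vector integer types in match.pd, the function names below are assumptions, and the signed form relies on right shift of a negative value being an arithmetic shift (GCC's behaviour, and guaranteed by the language since C++20).

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// a < 0 ? -1 : 0 versus an arithmetic shift by 31.
std::int32_t select_all_ones(std::int32_t a) { return a < 0 ? -1 : 0; }
std::int32_t shift_all_ones(std::int32_t a) { return a >> 31; }

// a < 0 ? 1 : 0 versus a logical shift by 31.
std::uint32_t select_one(std::int32_t a) { return a < 0 ? 1u : 0u; }
std::uint32_t shift_one(std::int32_t a)
{
    return static_cast<std::uint32_t>(a) >> 31;
}

void check_equivalence()
{
    const std::int32_t samples[] = {
        std::numeric_limits<std::int32_t>::min(), -1, 0, 1,
        std::numeric_limits<std::int32_t>::max()};
    for (std::int32_t a : samples) {
        assert(select_all_ones(a) == shift_all_ones(a));
        assert(select_one(a) == shift_one(a));
    }
}
```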

gcc/ChangeLog:

PR target/114189
* match.pd: Simplify a < 0 ? -1 : 0 to (signed) a >> 31 and a <
0 ? 1 : 0 to (unsigned) a >> 31 for vector integer type.

gcc/testsuite/ChangeLog:

* gcc.target/i386/avx2-pr115517.c: New test.
* gcc.target/i386/avx512-pr115517.c: New test.
* g++.target/i386/avx2-pr115517.C: New test.
* g++.target/i386/avx512-pr115517.C: New test.
* g++.dg/tree-ssa/pr88152-1.C: Adjust testcase.

[Bug testsuite/109360] RFE: check that generated .sarif files validate against the SARIF schema

2024-06-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109360

--- Comment #4 from GCC Commits  ---
The master branch has been updated by David Malcolm :

https://gcc.gnu.org/g:17967907102099806dc80c71ee7665ffb22ffa23

commit r15-1633-g17967907102099806dc80c71ee7665ffb22ffa23
Author: David Malcolm 
Date:   Tue Jun 25 20:26:21 2024 -0400

testsuite: use check-jsonschema for validating .sarif files [PR109360]

As reported here:
  https://gcc.gnu.org/pipermail/gcc-patches/2024-June/655434.html
the schema validation I added for generated .sarif files in
r15-1541-ga84fe222029ff2 used the "jsonschema" command line tool, which
has been deprecated by more recent versions of the Python 3 "jsonschema"
module.

This patch updates the validation to use the more recent
"check-jsonschema" command line tool, from the Python 3 "check-jsonschema"
module, fixing the testsuite FAILs due to the deprecation message.

As an added bonus, the output on validation failures is *much* nicer, e.g.
if I undo r15-1540-g9f4fdc3acebcf6, the error messages begin like this:
verify-sarif-file: res: Schema validation errors were encountered.
  diagnostic-format-sarif-file-bad-utf8-pr109098-1.c.sarif::$.runs[0].results[0].locations[0].physicalLocation.region.startColumn: 0 is less than the minimum of 1
  diagnostic-format-sarif-file-bad-utf8-pr109098-1.c.sarif::$.runs[0].results[0].relatedLocations[0].physicalLocation.region.startColumn: 0 is less than the minimum of 1
  diagnostic-format-sarif-file-bad-utf8-pr109098-1.c.sarif::$.runs[0].results[0].relatedLocations[1].physicalLocation.region.startColumn: 0 is less than the minimum of 1
  diagnostic-format-sarif-file-bad-utf8-pr109098-1.c.sarif::$.runs[0].results[0].relatedLocations[2].physicalLocation.region.startColumn: 0 is less than the minimum of 1
child process exited abnormally
FAIL: c-c++-common/diagnostic-format-sarif-file-bad-utf8-pr109098-1.c -Wc++-compat   (test .sarif output against SARIF schema)

Tested with Python 3.8 with check_jsonschema 0.28.6

gcc/ChangeLog:
PR testsuite/109360
* doc/install.texi (Python3 modules): Update SARIF validation
requirement to use check-jsonschema rather than jsonschema.

gcc/testsuite/ChangeLog:
PR testsuite/109360
* lib/scansarif.exp (verify-sarif-file): Use check-jsonschema
rather than jsonschema, updating the invocation accordingly.
* lib/target-supports.exp (check_effective_target_jsonschema):
Convert to...
(check_effective_target_check_jsonschema): ...this.

Signed-off-by: David Malcolm

[Bug c++/115504] [14/15 Regression] Wrong decltype result for a captured reference inside lambda since r14-5330

2024-06-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115504

--- Comment #3 from GCC Commits  ---
The master branch has been updated by Patrick Palka :

https://gcc.gnu.org/g:737449e5f233feb682b5dd2cc153892ad90a79bd

commit r15-1631-g737449e5f233feb682b5dd2cc153892ad90a79bd
Author: Patrick Palka 
Date:   Tue Jun 25 20:07:15 2024 -0400

c++: decltype of capture proxy of ref [PR115504]

The finish_decltype_type capture proxy handling added in r14-5330 was
incorrectly stripping references in the type of the captured variable.

PR c++/115504

gcc/cp/ChangeLog:

* semantics.cc (finish_decltype_type): Don't strip the reference
type (if any) of a capture proxy's captured variable.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1y/decltype-auto8.C: New test.

Reviewed-by: Jason Merrill

[Bug c/115587] [14/15 Regression] Possible uninitialized variable (decl) in c_parser_omp_loop_nest since r14-3489-g143151ac2013c2

2024-06-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115587

--- Comment #5 from GCC Commits  ---
The releases/gcc-14 branch has been updated by Sandra Loosemore :

https://gcc.gnu.org/g:b383719aebe45bbe8cc3944e515ed7caa30e8744

commit r14-10346-gb383719aebe45bbe8cc3944e515ed7caa30e8744
Author: Sandra Loosemore 
Date:   Tue Jun 25 13:54:43 2024 +

Fix PR c/115587, uninitialized variable in c_parser_omp_loop_nest

This function had a reference to an uninitialized variable on the
error path.  The problem was diagnosed by clang but not gcc.  It seems
the cleanest solution is to initialize all the loop-clause variables
at the point of declaration rather than at different places in the
code.

The C++ front end didn't have this problem, but I've made similar
changes there to keep the code in sync.
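
The pattern described above can be sketched as follows.  This is a hypothetical stand-in, not the parser code: the struct Clauses, the function parse_loop_header and its string fields are invented for the illustration.  The point is only that variables initialized at their declaration are safe to read on every exit path, including the error path.

```cpp
#include <cassert>

// Hypothetical stand-in for the loop-clause variables.
struct Clauses {
    const char *decl;
    const char *init;
    const char *cond;
    const char *incr;
};

Clauses parse_loop_header(bool malformed)
{
    // Initialize everything at the point of declaration, so the
    // shared exit below never reads an indeterminate value.
    const char *decl = nullptr;
    const char *init = nullptr;
    const char *cond = nullptr;
    const char *incr = nullptr;

    if (!malformed) {
        decl = "i";
        init = "i = 0";
        cond = "i < n";
        incr = "i++";
    }
    // Error path and success path share this exit.
    return Clauses{decl, init, cond, incr};
}

void parse_demo()
{
    assert(parse_loop_header(true).decl == nullptr);
    assert(parse_loop_header(false).decl != nullptr);
}
```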

gcc/c/ChangeLog:

PR c/115587
* c-parser.cc (c_parser_omp_loop_nest): Move initializations to
point of declaration.

gcc/cp/ChangeLog:

PR c/115587
* parser.cc (cp_parser_omp_loop_nest): Move initializations to
point of declaration.

(cherry picked from commit 21f1073d388af8af207183b0ed592e1cc47d20ab)

[Bug c++/115476] [13/14/15 Regression] __has_unique_object_representation ICE with array of uninstantiated type of unknown bound

2024-06-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115476

--- Comment #9 from GCC Commits  ---
The trunk branch has been updated by Marek Polacek :

https://gcc.gnu.org/g:fc382a373e6824bb998007d1dcb0805b0cf4b8e8

commit r15-1625-gfc382a373e6824bb998007d1dcb0805b0cf4b8e8
Author: Marek Polacek 
Date:   Mon Jun 17 17:53:12 2024 -0400

c++: ICE with __has_unique_object_representations [PR115476]

Here we started to ICE with r13-25: in check_trait_type, for "X[]" we
return true here:

  if (kind == 1 && TREE_CODE (type) == ARRAY_TYPE && !TYPE_DOMAIN (type))
    return true; // Array of unknown bound. Don't care about completeness.

and then end up crashing in record_has_unique_obj_representations:

4836  if (cur != wi::to_offset (sz))

because sz is null.

   
https://eel.is/c++draft/type.traits#tab:meta.unary.prop-row-47-column-3-sentence-1
says that the preconditions for __has_unique_object_representations are:
"T shall be a complete type, cv void, or an array of unknown bound" and
that "For an array type T, the same result as
has_unique_object_representations_v>" so T[]
should be treated as T.  So we should use kind==2 for the trait.
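
The rule that T[] is treated like T can be observed through the C++17 library trait.  The following is a sketch that assumes a C++17 compiler; the helper array_matches_element is a name made up for this illustration, and the example uses complete element types only, so it does not reproduce the ICE.

```cpp
#include <type_traits>

// For an array of unknown bound, the trait should give the same
// answer as for the element type.
template <typename T>
constexpr bool array_matches_element()
{
    return std::has_unique_object_representations_v<T[]>
           == std::has_unique_object_representations_v<T>;
}

static_assert(array_matches_element<int>(), "int[] behaves like int");
static_assert(array_matches_element<unsigned char>(),
              "unsigned char[] behaves like unsigned char");
```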

PR c++/115476

gcc/cp/ChangeLog:

* semantics.cc (finish_trait_expr)
: Move below to call
check_trait_type with kind==2.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/has-unique-obj-representations4.C: New test.

[Bug modula2/115540] "gcc/m2/mc-boot-ch/Gtermios.cc:292:20: error: return-statement with a value, in function returning 'void' [-fpermissive]" when HAVE_CFMAKERAW is defined

2024-06-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115540

--- Comment #3 from GCC Commits  ---
The master branch has been updated by Gaius Mulley :

https://gcc.gnu.org/g:d16355c72c7f7b54ecf06371d14d7ad309ea4c34

commit r15-1623-gd16355c72c7f7b54ecf06371d14d7ad309ea4c34
Author: Gaius Mulley 
Date:   Tue Jun 25 21:37:44 2024 +0100

PR modula2/115540 gcc/m2/mc-boot-ch/Gtermios.cc error return-statement with
a value

This patch fixes three occurrences of cfmakeraw use in the hand-built
m2 support libraries which incorrectly attempt to return a void
result.

gcc/m2/ChangeLog:

PR modula2/115540
* gm2-libs-ch/termios.c (cfmakeraw): Remove return.
* mc-boot-ch/Gtermios.cc (cfmakeraw): Remove return.
* pge-boot/Gtermios.cc (cfmakeraw): Remove return.

Signed-off-by: Gaius Mulley

[Bug c++/115425] [13/14/15 regression] ICE: tree check: expected type_pack_expansion or expr_pack_expansion, have error_mark in tsubst_pack_expansion, at cp/pt.cc:13778

2024-06-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115425

--- Comment #4 from GCC Commits  ---
The trunk branch has been updated by Marek Polacek :

https://gcc.gnu.org/g:ed6ffc4e62f716d1b31d599d22594dd969da137f

commit r15-1621-ged6ffc4e62f716d1b31d599d22594dd969da137f
Author: Marek Polacek 
Date:   Fri Jun 14 17:50:29 2024 -0400

c++: ICE with generic lambda and pack expansion [PR115425]

In r13-272 we hardened the *_PACK_EXPANSION and *_ARGUMENT_PACK macros.
That trips up here because make_pack_expansion returns error_mark_node
and we access that with PACK_EXPANSION_LOCAL_P.

PR c++/115425

gcc/cp/ChangeLog:

* pt.cc (tsubst_pack_expansion): Return error_mark_node if
make_pack_expansion doesn't work out.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/lambda-generic12.C: New test.

[Bug c++/115501] [13/14/15 Regression] ICE: in build_call_a with dynamic_cast after invalid definition of cxxabiv1::dynamic_cast since r13-3299

2024-06-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115501

--- Comment #10 from GCC Commits  ---
The trunk branch has been updated by Marek Polacek :

https://gcc.gnu.org/g:71f484d02b2b3e8616cd7af27a0d4c72e4c7e977

commit r15-1620-g71f484d02b2b3e8616cd7af27a0d4c72e4c7e977
Author: Marek Polacek 
Date:   Tue Jun 18 10:50:49 2024 -0400

c++: ICE with __dynamic_cast redecl [PR115501]

Since r13-3299, build_dynamic_cast_1 calls pushdecl which calls
duplicate_decls and that in this testcase emits the "conflicting
declaration" error and returns error_mark_node, so the subsequent
build_cxx_call crashes on the error_mark_node.

PR c++/115501

gcc/cp/ChangeLog:

* rtti.cc (build_dynamic_cast_1): Return if dcast_fn is erroneous.

gcc/testsuite/ChangeLog:

* g++.dg/rtti/dyncast8.C: New test.

[Bug rtl-optimization/111673] assign_hard_reg() routine should scale save/restore costs of callee save registers with basic block frequency

2024-06-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111673

--- Comment #1 from GCC Commits  ---
The master branch has been updated by Surya Kumari Jangala :

https://gcc.gnu.org/g:3b9b8d6cfdf59337f4b7ce10ce92a98044b2657b

commit r15-1619-g3b9b8d6cfdf59337f4b7ce10ce92a98044b2657b
Author: Surya Kumari Jangala 
Date:   Tue Jun 25 08:37:49 2024 -0500

ira: Scale save/restore costs of callee save registers with block frequency

In assign_hard_reg(), when computing the costs of the hard registers, the
cost of saving/restoring a callee-save hard register in prolog/epilog is
taken into consideration. However, this cost is not scaled with the entry
block frequency. Without scaling, the cost of saving/restoring is quite
small and this can result in a callee-save register being chosen by
assign_hard_reg() even though there are free caller-save registers
available. Assigning a callee save register to a pseudo that is live
in the entire function and across a call will cause shrink wrap to fail.
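
A toy model may help show why the missing scaling matters.  Everything here is hypothetical: the function names, the cost units and the numbers are invented for the illustration and are not taken from ira-color.cc.  The sketch only shows that an unscaled save/restore cost can look cheaper than the alternative, while the same cost scaled by the entry block frequency does not.

```cpp
#include <cassert>

// Hypothetical cost of using a callee-save register: the prolog/epilog
// save/restore cost, optionally scaled by the entry block frequency.
long callee_save_cost(long save_restore_cost, long entry_block_freq,
                      bool scaled)
{
    return scaled ? save_restore_cost * entry_block_freq
                  : save_restore_cost;
}

// The callee-save register wins only if it is strictly cheaper.
bool picks_callee_save(long callee_cost, long caller_cost)
{
    return callee_cost < caller_cost;
}

void cost_demo()
{
    // Illustrative numbers only.
    const long save_restore = 2, entry_freq = 1000, caller_cost = 500;
    // Unscaled: 2 < 500, so the callee-save register is chosen.
    assert(picks_callee_save(
        callee_save_cost(save_restore, entry_freq, false), caller_cost));
    // Scaled: 2000 > 500, so the caller-save register is preferred.
    assert(!picks_callee_save(
        callee_save_cost(save_restore, entry_freq, true), caller_cost));
}
```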

2024-06-25  Surya Kumari Jangala  

gcc/
PR rtl-optimization/111673
* ira-color.cc (assign_hard_reg): Scale save/restore costs of
callee save registers with block frequency.

gcc/testsuite/
PR rtl-optimization/111673
* gcc.target/powerpc/pr111673.c: New test.

[Bug modula2/115536] Expression is evaluated incorrectly when encountering relops and indirection

2024-06-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115536

--- Comment #3 from GCC Commits  ---
The master branch has been updated by Gaius Mulley :

https://gcc.gnu.org/g:9f168b412f44781013401492acfedf22afe7741b

commit r15-1618-g9f168b412f44781013401492acfedf22afe7741b
Author: Gaius Mulley 
Date:   Tue Jun 25 18:35:22 2024 +0100

PR modula2/115536 Expression is evaluated incorrectly when encountering
relops and indirection

This fix ensures that we only call BuildRelOpFromBoolean if we are
inside a constant expression (where no indirection can be used).
The fix creates a temporary variable when a boolean is created from
a relop in other cases.
The previous pattern implementation would not work if the operands required
dereferencing in non-constant expressions.  Comparisons of relop results
in a constant expression are resolved by constant propagation, basic
block analysis and dead code removal.  After the quadruples have been
optimized only one assignment to the boolean variable will remain for
const expressions.  All quadruple pattern checking for boolean
expressions is removed by the patch.  Thus the implementation becomes
more generic.

gcc/m2/ChangeLog:

PR modula2/115536
* gm2-compiler/M2BasicBlock.def (GetBasicBlockScope): New
procedure.
(GetBasicBlockStart): Ditto.
(GetBasicBlockEnd): Ditto.
(IsBasicBlockFirst): New procedure function.
* gm2-compiler/M2BasicBlock.mod (ConvertQuads2BasicBlock): Allow
conditional boolean quads to be removed.
(GetBasicBlockScope): Implement new procedure.
(GetBasicBlockStart): Ditto.
(GetBasicBlockEnd): Ditto.
(IsBasicBlockFirst): Implement new procedure function.
* gm2-compiler/M2GCCDeclare.def (FoldConstants): New parameter
declaration.
* gm2-compiler/M2GCCDeclare.mod (FoldConstants): New parameter
declaration.
(DeclareTypesConstantsProceduresInRange): Recreate basic blocks
after resolving constant expressions.
(CodeBecomes): Guard IsVariableSSA with IsVar.
* gm2-compiler/M2GenGCC.def (ResolveConstantExpressions): New
parameter declaration.
* gm2-compiler/M2GenGCC.mod (FoldIfLess): Remove relop pattern
detection.
(FoldIfGre): Ditto.
(FoldIfLessEqu): Ditto.
(FoldIfGreEqu): Ditto.
(FoldIfIn): Ditto.
(FoldIfNotIn): Ditto.
(FoldIfEqu): Ditto.
(FoldIfNotEqu): Ditto.
(FoldBecomes): Add BasicBlock parameter and allow conditional
boolean becomes to be folded in the first basic block.
(ResolveConstantExpressions): Reimplement.
* gm2-compiler/M2Quads.def (IsConstQuad): New procedure function.
(IsConditionalBooleanQuad): Ditto.
* gm2-compiler/M2Quads.mod (IsConstQuad): Implement new procedure
function.
(IsConditionalBooleanQuad): Ditto.
(MoveWithMode): Use GenQuadOTypetok.
(IsInitialisingConst): Rewrite using OpUsesOp1.
(OpUsesOp1): New procedure function.
(doBuildAssignment): Mark des as a VarConditional.
(ConvertBooleanToVariable): Call PutVarConditional.
(DumpQuadSummary): New procedure.
(BuildRelOpFromBoolean): Updated debugging and improved comments.
(BuildRelOp): Only call BuildRelOpFromBoolean if we are in a const
expression and both operands are boolean relops.
(GenQuadOTypeUniquetok): New procedure.
(BackPatch): Correct comment.
* gm2-compiler/SymbolTable.def (PutVarConditional): New procedure.
(IsVarConditional): New procedure function.
* gm2-compiler/SymbolTable.mod (PutVarConditional): Implement new
procedure.
(IsVarConditional): Implement new procedure function.
(SymConstVar): New field IsConditional.
(SymVar): New field IsConditional.
(MakeVar): Initialize IsConditional field.
(MakeConstVar): Initialize IsConditional field.
* gm2-compiler/M2Swig.mod (DoBasicBlock): Change parameters to
use BasicBlock.
* gm2-compiler/M2Code.mod (SecondDeclareAndOptimize): Use iterator
to FoldConstants over basic block list.
* gm2-compiler/M2SymInit.mod (AppendEntry): Replace parameters
with BasicBlock.
* gm2-compiler/P3Build.bnf (Relation): Call RecordOp for #, <> and
=.

gcc/testsuite/ChangeLog:

PR modula2/115536
* gm2/iso/const/pass/constbool4.mod: New test.
* gm2/iso/const/pass/constbool5.mod: New test.
* gm2/iso/run/pass/condtest2.mod: New test.
* gm2/iso/run/pass/condtest3.mod: New test.
*

[Bug c++/115198] Class template argument deduction fails for copy ctor when used with an alias template if the aliased class template has explicitly defaulted copy ctor

2024-06-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115198

--- Comment #5 from GCC Commits  ---
The master branch has been updated by Patrick Palka :

https://gcc.gnu.org/g:06ebb7c6f31fe42ffdea6f51ac1ba1f6b058c090

commit r15-1615-g06ebb7c6f31fe42ffdea6f51ac1ba1f6b058c090
Author: Patrick Palka 
Date:   Tue Jun 25 12:59:24 2024 -0400

c++: alias CTAD and copy deduction guide [PR115198]

Here we're neglecting to update DECL_NAME during the alias CTAD guide
transformation, which causes copy_guide_p to return false for the
transformed copy deduction guide since DECL_NAME is still __dguide_C
with TREE_TYPE C but it should be __dguide_A with TREE_TYPE A
(i.e. C).  This ultimately results in ambiguity during
overload resolution between the copy deduction guide vs copy ctor guide.

This patch makes us update DECL_NAME of a transformed guide accordingly
during alias/inherited CTAD.

PR c++/115198

gcc/cp/ChangeLog:

* pt.cc (alias_ctad_tweaks): Update DECL_NAME of the transformed
guides.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/class-deduction-alias22.C: New test.

Reviewed-by: Jason Merrill

[Bug c++/115358] [13/14/15 Regression] template argument deduction/substitution failed in generic lambda function use of static constexpr array type whos initializer defines the size since r13-2540

2024-06-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115358

--- Comment #8 from GCC Commits  ---
The master branch has been updated by Patrick Palka :

https://gcc.gnu.org/g:e3915c1ad56591cbd68229a64c941c38330abd69

commit r15-1614-ge3915c1ad56591cbd68229a64c941c38330abd69
Author: Patrick Palka 
Date:   Tue Jun 25 10:42:21 2024 -0400

c++: using non-dep array var of unknown bound [PR115358]

For a non-dependent array variable of unknown bound, it seems we need to
try instantiating its definition upon use in a template context for sake
of proper checking and typing of the overall expression, like we do for
function specializations with deduced return type.

PR c++/115358

gcc/cp/ChangeLog:

* decl2.cc (mark_used): Call maybe_instantiate_decl for an array
variable with unknown bound.
* semantics.cc (finish_decltype_type): Remove now redundant
handling of array variables with unknown bound.
* typeck.cc (cxx_sizeof_expr): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/template/array37.C: New test.

Reviewed-by: Jason Merrill

[Bug c/115587] [14/15 Regression] Possible uninitialized variable (decl) in c_parser_omp_loop_nest since r14-3489-g143151ac2013c2

2024-06-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115587

--- Comment #3 from GCC Commits  ---
The master branch has been updated by Sandra Loosemore :

https://gcc.gnu.org/g:21f1073d388af8af207183b0ed592e1cc47d20ab

commit r15-1613-g21f1073d388af8af207183b0ed592e1cc47d20ab
Author: Sandra Loosemore 
Date:   Tue Jun 25 13:54:43 2024 +

Fix PR c/115587, uninitialized variable in c_parser_omp_loop_nest

This function had a reference to an uninitialized variable on the
error path.  The problem was diagnosed by clang but not gcc.  It seems
the cleanest solution is to initialize all the loop-clause variables
at the point of declaration rather than at different places in the
code.

The C++ front end didn't have this problem, but I've made similar
changes there to keep the code in sync.
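
A minimal sketch of the pattern described above, assuming a hypothetical
parse_clause helper and struct clause that are not part of GCC.  It only
illustrates why initializing every local at its declaration keeps the
error path from reading an indeterminate value; it is not the code from
c_parser_omp_loop_nest.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a parsed loop clause; not a GCC type.  */
struct clause { int valid; int value; };

/* Returns the clause value, or -1 on the error path.  Every local is
   initialized where it is declared, so the code after 'done' never
   reads an indeterminate value, whichever branch jumped there.  */
int parse_clause(const char *text, struct clause *out)
{
    int result = -1;               /* defined before any early exit */
    struct clause tmp = { 0, 0 };  /* likewise */

    assert(out != NULL);

    if (text == NULL || *text == '\0')
        goto done;                 /* error path: tmp and result are set */

    tmp.valid = 1;
    tmp.value = (int) (unsigned char) text[0];
    result = tmp.value;

done:
    *out = tmp;
    return result;
}
```

If tmp were instead assigned only on the success branch, the store through
out on the error path would copy an uninitialized object, which is the
kind of use the commit message says clang diagnosed.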

gcc/c/ChangeLog:

PR c/115587
* c-parser.cc (c_parser_omp_loop_nest): Move initializations to
point of declaration.

gcc/cp/ChangeLog:

PR c/115587
* parser.cc (cp_parser_omp_loop_nest): Move initializations to
point of declaration.

[Bug target/115608] ICE in extract_insn, at recog.cc:2812 when building with -mv8plus

2024-06-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115608

--- Comment #9 from GCC Commits  ---
The releases/gcc-14 branch has been updated by Eric Botcazou :

https://gcc.gnu.org/g:4bf93fc3d360dbeb5c07303c1b5028989c575ac1

commit r14-10345-g4bf93fc3d360dbeb5c07303c1b5028989c575ac1
Author: Eric Botcazou 
Date:   Tue Jun 25 11:47:48 2024 +0200

SPARC: fix internal error with -mv8plus on 64-bit Linux

This passes -m32 when -mv8plus is specified on Linux (like on Solaris).

gcc/
PR target/115608
* config/sparc/linux64.h (CC1_SPEC): Pass -m32 for -mv8plus.

[Bug target/115608] ICE in extract_insn, at recog.cc:2812 when building with -mv8plus

2024-06-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115608

--- Comment #8 from GCC Commits  ---
The master branch has been updated by Eric Botcazou :

https://gcc.gnu.org/g:d4db77ce37a65207baea88859fd9c191469187f8

commit r15-1608-gd4db77ce37a65207baea88859fd9c191469187f8
Author: Eric Botcazou 
Date:   Tue Jun 25 11:47:48 2024 +0200

SPARC: fix internal error with -mv8plus on 64-bit Linux

This passes -m32 when -mv8plus is specified on Linux (like on Solaris).

gcc/
PR target/115608
* config/sparc/linux64.h (CC1_SPEC): Pass -m32 for -mv8plus.

[Bug rtl-optimization/106594] [13/14/15 Regression] sign-extensions no longer merged into addressing mode

2024-06-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106594

--- Comment #31 from GCC Commits  ---
The trunk branch has been updated by Thomas Schwinge :

https://gcc.gnu.org/g:70480055636c2ca79761cb4440e930daa16bb7aa

commit r15-1607-g70480055636c2ca79761cb4440e930daa16bb7aa
Author: Thomas Schwinge 
Date:   Tue Jun 25 10:55:41 2024 +0200

rs6000: Properly default-disable late-combine passes [PR106594, PR115622,
PR115633]

..., so that it also works for '__attribute__ ((optimize("[...]")))' etc.

PR target/106594
PR target/115622
PR target/115633
gcc/
* config/rs6000/rs6000.cc (rs6000_option_override_internal): Move
default-disable of late-combine passes from here...
(rs6000_override_options_after_change): ... to here.

[Bug target/115633] [15 Regression] powerpc64le: "relocation truncated to fit: R_PPC64_TOC16 against `.rodata.cst4'" with (default) '-flate-combine-instructions' since r15-1579-g792f97b44ffc5e6a967292

2024-06-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115633

--- Comment #3 from GCC Commits  ---
The trunk branch has been updated by Thomas Schwinge :

https://gcc.gnu.org/g:70480055636c2ca79761cb4440e930daa16bb7aa

commit r15-1607-g70480055636c2ca79761cb4440e930daa16bb7aa
Author: Thomas Schwinge 
Date:   Tue Jun 25 10:55:41 2024 +0200

rs6000: Properly default-disable late-combine passes [PR106594, PR115622,
PR115633]

..., so that it also works for '__attribute__ ((optimize("[...]")))' etc.

PR target/106594
PR target/115622
PR target/115633
gcc/
* config/rs6000/rs6000.cc (rs6000_option_override_internal): Move
default-disable of late-combine passes from here...
(rs6000_override_options_after_change): ... to here.

[Bug other/115622] gcc.dg/ipa/iinline-attr.c fails after r15-1579-g792f97b44ffc5e

2024-06-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115622

--- Comment #2 from GCC Commits  ---
The trunk branch has been updated by Thomas Schwinge :

https://gcc.gnu.org/g:70480055636c2ca79761cb4440e930daa16bb7aa

commit r15-1607-g70480055636c2ca79761cb4440e930daa16bb7aa
Author: Thomas Schwinge 
Date:   Tue Jun 25 10:55:41 2024 +0200

rs6000: Properly default-disable late-combine passes [PR106594, PR115622,
PR115633]

..., so that it also works for '__attribute__ ((optimize("[...]")))' etc.

PR target/106594
PR target/115622
PR target/115633
gcc/
* config/rs6000/rs6000.cc (rs6000_option_override_internal): Move
default-disable of late-combine passes from here...
(rs6000_override_options_after_change): ... to here.

[Bug c/114930] [14/15 regression] ICE in fld_incomplete_type_of when building libwebp with -std=c23 -flto

2024-06-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114930

--- Comment #8 from GCC Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:777cc6a01d1cf783a36d0fa67ab20f0312f35d7a

commit r15-1597-g777cc6a01d1cf783a36d0fa67ab20f0312f35d7a
Author: Jakub Jelinek 
Date:   Tue Jun 25 08:35:56 2024 +0200

c: Fix ICE related to incomplete structures in C23 [PR114930]

Here is a version of the c_update_type_canonical fixes which passed
bootstrap/regtest.
The non-trivial part is the handling of the case when
build_qualified_type (TYPE_CANONICAL (t), TYPE_QUALS (x))
returns a type with NULL TYPE_CANONICAL.  That should happen only
if TYPE_CANONICAL (t) == t, because otherwise c_update_type_canonical
should have been already called on the other type.  c, the returned type,
is usually x and in that case it should have TYPE_CANONICAL set to itself,
or worst for whatever reason x is not the right canonical type (say it has
attributes or whatever disqualifies it from check_qualified_type).  In that
case either it finds some pre-existing type from the variant chain of t
which is later in the chain and we haven't processed it yet (but then
get_qualified_type moves it right after t in:
/* Put the found variant at the head of the variant list so
   frequently searched variants get found faster.  The C++ FE
   benefits greatly from this.  */
tree t = *tp;
*tp = TYPE_NEXT_VARIANT (t);
TYPE_NEXT_VARIANT (t) = TYPE_NEXT_VARIANT (mv);
TYPE_NEXT_VARIANT (mv) = t;
return t;
optimization), or creates a fresh new type using build_variant_type_copy,
which again places the new type right after t:
  /* Add the new type to the chain of variants of TYPE.  */
  TYPE_NEXT_VARIANT (t) = TYPE_NEXT_VARIANT (m);
  TYPE_NEXT_VARIANT (m) = t;
  TYPE_MAIN_VARIANT (t) = m;
At this point we want to make c its own canonical type (i.e.
TYPE_CANONICAL (c) = c;), but also need to process pointers to it and only
then return back to processing x.  Processing the whole chain from c again
could be costly, we could have hundreds of types in the chain already
processed, and while the loop would just quickly skip them
  for (tree x = t, l = NULL_TREE; x; l = x, x = TYPE_NEXT_VARIANT (x))
    {
      if (x != t && TYPE_STRUCTURAL_EQUALITY_P (x))
        ...
      else if (x != t)
        continue;
it feels costly.  So, this patch instead moves c from right after t
to right before x in the chain (that shouldn't change anything, because
clearly build_qualified_type didn't find any matches in the chain before
x) and continues processing the c at that position, so should handle the
x that encountered this in the next iteration.

We could avoid some of the moving in the chain if we processed the chain
twice, once deal only with x != t && TYPE_STRUCTURAL_EQUALITY_P (x)
&& TYPE_CANONICAL (t) == t && check_qualified_type (t, x, TYPE_QUALS (x))
types (in that case set TYPE_CANONICAL (x) = x) and once the rest.  There
is still the theoretical case where build_qualified_type would return
a new type and in that case we are back to the moving the type around and
needing to handle it though.
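
The chain manipulation described above (moving c from right after t to
right before x) can be sketched on a plain singly linked list.  This is
an illustration under stated assumptions, not the patch itself: struct
node, move_before and the id field are hypothetical, and the next pointer
merely stands in for TYPE_NEXT_VARIANT.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list node; 'next' plays the role of TYPE_NEXT_VARIANT.  */
struct node { int id; struct node *next; };

/* Move c, which must directly follow t, so that it sits directly before
   x.  l must be the current predecessor of x.  Nodes between c and x
   keep their relative order.  */
void move_before(struct node *t, struct node *c,
                 struct node *l, struct node *x)
{
    assert(t->next == c && l->next == x && c != x);

    if (l == c)
        return;             /* c is already the predecessor of x */

    t->next = c->next;      /* unlink c from its slot after t */
    c->next = x;            /* splice c in front of x */
    l->next = c;
}
```

After the move, a walk that has reached l continues with c and then x, so
the already-processed nodes between t and l are not visited again.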

2024-06-25  Jakub Jelinek  
Martin Uecker  

PR c/114930
PR c/115502
gcc/c/
* c-decl.cc (c_update_type_canonical): Assert t is main variant
with 0 TYPE_QUALS.  Simplify and don't use check_qualified_type.
Deal with the case where build_qualified_type returns
TYPE_STRUCTURAL_EQUALITY_P type.
gcc/testsuite/
* gcc.dg/pr114574-1.c: Require lto effective target.
* gcc.dg/pr114574-2.c: Likewise.
* gcc.dg/pr114930.c: New test.
* gcc.dg/pr115502.c: New test.

[Bug c/115502] [15 regression] ICE when building Valgrind with -std=c23 (comptypes_same_p, at c/c-typeck.cc:1227) since r15-934-gd2cfe8a73b3c41

2024-06-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115502

--- Comment #7 from GCC Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:777cc6a01d1cf783a36d0fa67ab20f0312f35d7a

commit r15-1597-g777cc6a01d1cf783a36d0fa67ab20f0312f35d7a
Author: Jakub Jelinek 
Date:   Tue Jun 25 08:35:56 2024 +0200

c: Fix ICE related to incomplete structures in C23 [PR114930]

Here is a version of the c_update_type_canonical fixes which passed
bootstrap/regtest.
The non-trivial part is the handling of the case when
build_qualified_type (TYPE_CANONICAL (t), TYPE_QUALS (x))
returns a type with NULL TYPE_CANONICAL.  That should happen only
if TYPE_CANONICAL (t) == t, because otherwise c_update_type_canonical
should have been already called on the other type.  c, the returned type,
is usually x and in that case it should have TYPE_CANONICAL set to itself,
or worst for whatever reason x is not the right canonical type (say it has
attributes or whatever disqualifies it from check_qualified_type).  In that
case either it finds some pre-existing type from the variant chain of t
which is later in the chain and we haven't processed it yet (but then
get_qualified_type moves it right after t in:
/* Put the found variant at the head of the variant list so
   frequently searched variants get found faster.  The C++ FE
   benefits greatly from this.  */
tree t = *tp;
*tp = TYPE_NEXT_VARIANT (t);
TYPE_NEXT_VARIANT (t) = TYPE_NEXT_VARIANT (mv);
TYPE_NEXT_VARIANT (mv) = t;
return t;
optimization), or creates a fresh new type using build_variant_type_copy,
which again places the new type right after t:
  /* Add the new type to the chain of variants of TYPE.  */
  TYPE_NEXT_VARIANT (t) = TYPE_NEXT_VARIANT (m);
  TYPE_NEXT_VARIANT (m) = t;
  TYPE_MAIN_VARIANT (t) = m;
At this point we want to make c its own canonical type (i.e.
TYPE_CANONICAL (c) = c;), but also need to process pointers to it and only
then return back to processing x.  Processing the whole chain from c again
could be costly, we could have hundreds of types in the chain already
processed, and while the loop would just quickly skip them
  for (tree x = t, l = NULL_TREE; x; l = x, x = TYPE_NEXT_VARIANT (x))
    {
      if (x != t && TYPE_STRUCTURAL_EQUALITY_P (x))
        ...
      else if (x != t)
        continue;
it feels costly.  So, this patch instead moves c from right after t
to right before x in the chain (that shouldn't change anything, because
clearly build_qualified_type didn't find any matches in the chain before
x) and continues processing the c at that position, so should handle the
x that encountered this in the next iteration.

We could avoid some of the moving in the chain if we processed the chain
twice, once deal only with x != t && TYPE_STRUCTURAL_EQUALITY_P (x)
&& TYPE_CANONICAL (t) == t && check_qualified_type (t, x, TYPE_QUALS (x))
types (in that case set TYPE_CANONICAL (x) = x) and once the rest.  There
is still the theoretical case where build_qualified_type would return
a new type and in that case we are back to the moving the type around and
needing to handle it though.

2024-06-25  Jakub Jelinek  
Martin Uecker  

PR c/114930
PR c/115502
gcc/c/
* c-decl.cc (c_update_type_canonical): Assert t is main variant
with 0 TYPE_QUALS.  Simplify and don't use check_qualified_type.
Deal with the case where build_qualified_type returns
TYPE_STRUCTURAL_EQUALITY_P type.
gcc/testsuite/
* gcc.dg/pr114574-1.c: Require lto effective target.
* gcc.dg/pr114574-2.c: Likewise.
* gcc.dg/pr114930.c: New test.
* gcc.dg/pr115502.c: New test.

[Bug c++/115624] '-Wnrvo' is not an option that controls warnings

2024-06-24 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115624

--- Comment #3 from GCC Commits  ---
The releases/gcc-14 branch has been updated by Andrew Pinski :

https://gcc.gnu.org/g:b7157f3930762097210aa24a3f24ed5cafee6672

commit r14-10344-gb7157f3930762097210aa24a3f24ed5cafee6672
Author: Andrew Pinski 
Date:   Mon Jun 24 18:16:13 2024 -0700

c-family: Add Warning property to Wnrvo option [PR115624]

This was missing when Wnrvo was added in
r14-1594-g2ae5384d457b9c67586de012816dfc71a6943164 .

Pushed after a bootstrap/test on x86_64-linux-gnu.

gcc/c-family/ChangeLog:

PR c++/115624
* c.opt (Wnrvo): Add Warning property.

Signed-off-by: Andrew Pinski 
(cherry picked from commit f7747210947a7c66e865c6ac571cce39e2b87caf)

[Bug c++/115624] '-Wnrvo' is not an option that controls warnings

2024-06-24 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115624

--- Comment #2 from GCC Commits  ---
The trunk branch has been updated by Andrew Pinski :

https://gcc.gnu.org/g:f7747210947a7c66e865c6ac571cce39e2b87caf

commit r15-1590-gf7747210947a7c66e865c6ac571cce39e2b87caf
Author: Andrew Pinski 
Date:   Mon Jun 24 18:16:13 2024 -0700

c-family: Add Warning property to Wnrvo option [PR115624]

This was missing when Wnrvo was added in
r14-1594-g2ae5384d457b9c67586de012816dfc71a6943164 .

Pushed after a bootstrap/test on x86_64-linux-gnu.

gcc/c-family/ChangeLog:

PR c++/115624
* c.opt (Wnrvo): Add Warning property.

Signed-off-by: Andrew Pinski

[Bug fortran/55978] class_optional_2.f90 -Os fails

2024-06-24 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55978

--- Comment #34 from GCC Commits  ---
The master branch has been updated by Harald Anlauf :

https://gcc.gnu.org/g:f02c70dafd384f0c44d7a0920f4a75a30e267045

commit r15-1585-gf02c70dafd384f0c44d7a0920f4a75a30e267045
Author: Harald Anlauf 
Date:   Sun Jun 23 22:36:43 2024 +0200

Fortran: fix passing of optional dummy as actual to optional argument
[PR55978]

gcc/fortran/ChangeLog:

PR fortran/55978
* trans-array.cc (gfc_conv_array_parameter): Do not dereference
data component of a missing allocatable dummy array argument for
passing as actual to optional dummy.  Harden logic of presence
check for optional pointer dummy by using TRUTH_ANDIF_EXPR instead
of TRUTH_AND_EXPR.

gcc/testsuite/ChangeLog:

PR fortran/55978
* gfortran.dg/optional_absent_12.f90: New test.
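
As a rough C analogy for the TRUTH_ANDIF_EXPR change in the ChangeLog
above: TRUTH_ANDIF_EXPR corresponds to a short-circuit "&&", whereas
TRUTH_AND_EXPR evaluates both operands.  The sketch below assumes a
hypothetical struct desc and present_with_data helper; it does not
reproduce the gfortran descriptor layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an array descriptor.  */
struct desc { int *data; };

/* Returns 1 only when the optional argument is present and has data.
   The short-circuit && guarantees opt->data is never read when opt is
   NULL; a non-short-circuit form such as
   (opt != NULL) & (opt->data != NULL) would dereference the missing
   argument.  */
int present_with_data(const struct desc *opt)
{
    return opt != NULL && opt->data != NULL;
}
```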

[Bug tree-optimization/113673] [12/13/14/15 Regression] ICE: verify_flow_info failed: BB 5 cannot throw but has an EH edge with -Os -finstrument-functions -fnon-call-exceptions -ftrapv

2024-06-24 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113673

--- Comment #8 from GCC Commits  ---
The master branch has been updated by Roger Sayle :

https://gcc.gnu.org/g:d8b05aef77443e1d3d8f3f5d2c56ac49a503fee3

commit r15-1584-gd8b05aef77443e1d3d8f3f5d2c56ac49a503fee3
Author: Roger Sayle 
Date:   Mon Jun 24 15:34:03 2024 +0100

PR tree-optimization/113673: Avoid load merging when potentially trapping.

This patch fixes PR tree-optimization/113673, a P2 ice-on-valid regression
caused by load merging of (ptr[0]<<8)+ptr[1] when -ftrapv has been
specified.  When the operator is | or ^ this is safe, but for addition
of signed integer types, a trap may be generated/required, so merging this
idiom into a single non-trapping instruction is inappropriate, confusing
the compiler by transforming a basic block with an exception edge into one
without.

This revision implements Richard Biener's feedback to add an early check
for stmt_can_throw_internal (cfun, stmt) to prevent transforming in the
presence of any statement that could trap, not just overflow on addition.
The one other tweak included in this patch is to mark the local function
find_bswap_or_nop_load as static ensuring that it isn't called from outside
this file, and guaranteeing that it is dominated by stmt_can_throw_internal
checking.
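
For illustration, the two forms of the idiom mentioned above look like
this in C.  The function names are hypothetical and the sketch says
nothing about what a given GCC version emits; it only shows which
operator carries signed-addition semantics that -ftrapv has to honour.

```c
#include <assert.h>

/* Bitwise OR of the two bytes: no arithmetic overflow is involved, so
   the commit message treats merging this into one wider load as safe.  */
unsigned load_be16_or(const unsigned char *ptr)
{
    return ((unsigned) ptr[0] << 8) | ptr[1];
}

/* Signed addition of the promoted bytes: with -ftrapv the addition is
   an operation that may be required to trap, so per the commit message
   it must not be folded into a single non-trapping instruction.  */
int load_be16_add(const unsigned char *ptr)
{
    return (ptr[0] << 8) + ptr[1];
}
```

Both functions return the same value for every two-byte input; the
difference lies only in whether the statement is considered potentially
trapping.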

2024-06-24  Roger Sayle  
Richard Biener  

gcc/ChangeLog
PR tree-optimization/113673
* gimple-ssa-store-merging.cc (find_bswap_or_nop_load): Make
static.
(find_bswap_or_nop_1): Avoid transformations (load merging) when
stmt_can_throw_internal indicates that a statement can trap.

gcc/testsuite/ChangeLog
PR tree-optimization/113673
* g++.dg/pr113673.C: New test case.

[Bug tree-optimization/115602] [15 Regression] ICE on liblapack-3.12.0: in vect_schedule_slp_node, at tree-vect-slp.cc:9643 since r15-1565-g2a345214fc332b

2024-06-24 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115602

--- Comment #9 from GCC Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:c43c74f6ec795a586388de7abfdd20a0040f6f16

commit r15-1583-gc43c74f6ec795a586388de7abfdd20a0040f6f16
Author: Richard Biener 
Date:   Mon Jun 24 09:52:39 2024 +0200

tree-optimization/115602 - SLP CSE results in cycles

The following prevents SLP CSE from creating new cycles, which happened
because of a 1:1 permute node being present where its child was then
CSEd to the permute node.  Fixed by making a node available to CSE
only after recursing.

PR tree-optimization/115602
* tree-vect-slp.cc (vect_cse_slp_nodes): Delay populating the
bst-map to avoid cycles.

* gcc.dg/vect/pr115602.c: New testcase.

[Bug middle-end/115528] [15 regression] segmentation fault in legacy F77 code since r15-1238-g1fe55a1794863b

2024-06-24 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115528

--- Comment #30 from GCC Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:2f83ea87ee328d337f87d4430861221be9babe1e

commit r15-1582-g2f83ea87ee328d337f87d4430861221be9babe1e
Author: Richard Biener 
Date:   Fri Jun 21 13:19:26 2024 +0200

tree-optimization/115528 - fix vect alignment analysis for outer loop vect

For outer loop vectorization of a data reference in the inner loop
we have to look at both steps to see if they preserve alignment.

What is special for this testcase is that the outer loop step is
one element but the inner loop step four and that we now use SLP
and the vectorization factor is one.
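
A simplified model of the check, assuming steps measured in bytes and a
power-of-two vector alignment.  The helper names, the 4-byte element size
and the 16-byte alignment in the usage note are assumptions for
illustration; this is not the logic of vect_compute_data_ref_alignment
itself.

```c
#include <assert.h>
#include <stdbool.h>

/* A step keeps an aligned access aligned only if it is a multiple of
   the vector alignment.  */
bool step_preserves_alignment(long step_bytes, long align_bytes)
{
    assert(align_bytes > 0);
    return step_bytes % align_bytes == 0;
}

/* For a reference in the inner loop of an outer-loop vectorization,
   both the inner and the outer step must preserve alignment.  */
bool ref_stays_aligned(long inner_step_bytes, long outer_step_bytes,
                       long align_bytes)
{
    return step_preserves_alignment(inner_step_bytes, align_bytes)
           && step_preserves_alignment(outer_step_bytes, align_bytes);
}
```

With 4-byte elements and 16-byte vectors, an inner step of four elements
(16 bytes) preserves alignment but an outer step of one element (4 bytes)
does not, so looking at the inner step alone would wrongly report the
access as aligned.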

PR tree-optimization/115528
* tree-vect-data-refs.cc (vect_compute_data_ref_alignment):
Make sure to look at both the inner and outer loop step
behavior.

* gfortran.dg/vect/pr115528.f: New testcase.

[Bug rtl-optimization/114996] [15 Regression] [RISC-V] 2->2 combination no longer occurring

2024-06-24 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114996

--- Comment #7 from GCC Commits  ---
The trunk branch has been updated by Richard Sandiford :

https://gcc.gnu.org/g:792f97b44ffc5e6a967292b3747fd835e99396e7

commit r15-1579-g792f97b44ffc5e6a967292b3747fd835e99396e7
Author: Richard Sandiford 
Date:   Mon Jun 24 08:43:19 2024 +0100

Add a late-combine pass [PR106594]

This patch adds a combine pass that runs late in the pipeline.
There are two instances: one between combine and split1, and one
after postreload.

The pass currently has a single objective: remove definitions by
substituting into all uses.  The pre-RA version tries to restrict
itself to cases that are likely to have a neutral or beneficial
effect on register pressure.

The patch fixes PR106594.  It also fixes a few FAILs and XFAILs
in the aarch64 test results, mostly due to making proper use of
MOVPRFX in cases where we didn't previously.

This is just a first step.  I'm hoping that the pass could be
used for other combine-related optimisations in future.  In particular,
the post-RA version doesn't need to restrict itself to cases where all
uses are substitutable, since it doesn't have to worry about register
pressure.  If we did that, and if we extended it to handle multi-register
REGs, the pass might be a viable replacement for regcprop, which in
turn might reduce the cost of having a post-RA instance of the new pass.

On most targets, the pass is enabled by default at -O2 and above.
However, it has a tendency to undo x86's STV and RPAD passes,
by folding the more complex post-STV/RPAD form back into the
simpler pre-pass form.

Also, running a pass after register allocation means that we can
now match define_insn_and_splits that were previously only matched
before register allocation.  This trips things like:

  (define_insn_and_split "..."
    [...pattern...]
    "...cond..."
    "#"
    "&& 1"
    [...pattern...]
    {
      ...unconditional use of gen_reg_rtx ()...;
    }

because matching and splitting after RA will call gen_reg_rtx when
pseudos are no longer allowed.  rs6000 has several instances of this.

xtensa has a variation in which the split condition is:

"&& can_create_pseudo_p ()"

The failure then is that, if we match after RA, we'll never be
able to split the instruction.

The patch therefore disables the pass by default on i386, rs6000
and xtensa.  Hopefully we can fix those ports later (if their
maintainers want).  It seems better to add the pass first, though,
to make it easier to test any such fixes.

gcc.target/aarch64/bitfield-bitint-abi-align{16,8}.c would need
quite a few updates for the late-combine output.  That might be
worth doing, but it seems too complex to do as part of this patch.

I tried compiling at least one target per CPU directory and comparing
the assembly output for parts of the GCC testsuite.  This is just a way
of getting a flavour of how the pass performs; it obviously isn't a
meaningful benchmark.  All targets seemed to improve on average:

Target                Tests   Good   Bad   %Good    Delta  Median
======                =====   ====   ===   =====    =====  ======
aarch64-linux-gnu      2215   1975   240  89.16%    -4159      -1
aarch64_be-linux-gnu   1569   1483    86  94.52%   -10117      -1
alpha-linux-gnu        1454   1370    84  94.22%    -9502      -1
amdgcn-amdhsa          5122   4671   451  91.19%   -35737      -1
arc-elf                2166   1932   234  89.20%   -37742      -1
arm-linux-gnueabi      1953   1661   292  85.05%   -12415      -1
arm-linux-gnueabihf    1834   1549   285  84.46%   -11137      -1
avr-elf                4789   4330   459  90.42%  -441276      -4
bfin-elf               2795   2394   401  85.65%   -19252      -1
bpf-elf                3122   2928   194  93.79%    -8785      -1
c6x-elf                2227   1929   298  86.62%   -17339      -1
cris-elf               3464   3270   194  94.40%   -23263      -2
csky-elf               2915   2591   324  88.89%   -22146      -1
epiphany-elf           2399   2304    95  96.04%   -28698      -2
fr30-elf               7712   7299   413  94.64%   -99830      -2
frv-linux-gnu          3332   2877   455  86.34%   -25108      -1
ft32-elf               2775   2667   108  96.11%   -25029      -1
h8300-elf              3176   2862   314  90.11%   -29305      -2
hppa64-hp-hpux11.23    4287   4247    40  99.07%   -45963      -2
ia64-linux-gnu         2343   1946   397  83.06%    -9907      -2
iq2000-elf             9684   9637    47  99.51%  -126557      -2
lm32-elf               2681   2608    73  97.28%   -59884      -3
loongarch64-linux-gnu  1303   1218    85  93.48%   -13375      -2

[Bug rtl-optimization/115104] [15 Regression] RISC-V: GCC-14 can combine vsext+vadd -> vwadd but Trunk GCC (GCC 15) Failed

2024-06-24 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115104

--- Comment #6 from GCC Commits  ---
The trunk branch has been updated by Richard Sandiford :

https://gcc.gnu.org/g:792f97b44ffc5e6a967292b3747fd835e99396e7

commit r15-1579-g792f97b44ffc5e6a967292b3747fd835e99396e7
Author: Richard Sandiford 
Date:   Mon Jun 24 08:43:19 2024 +0100

Add a late-combine pass [PR106594]

This patch adds a combine pass that runs late in the pipeline.
There are two instances: one between combine and split1, and one
after postreload.

The pass currently has a single objective: remove definitions by
substituting into all uses.  The pre-RA version tries to restrict
itself to cases that are likely to have a neutral or beneficial
effect on register pressure.

The patch fixes PR106594.  It also fixes a few FAILs and XFAILs
in the aarch64 test results, mostly due to making proper use of
MOVPRFX in cases where we didn't previously.

This is just a first step.  I'm hoping that the pass could be
used for other combine-related optimisations in future.  In particular,
the post-RA version doesn't need to restrict itself to cases where all
uses are substitutable, since it doesn't have to worry about register
pressure.  If we did that, and if we extended it to handle multi-register
REGs, the pass might be a viable replacement for regcprop, which in
turn might reduce the cost of having a post-RA instance of the new pass.

On most targets, the pass is enabled by default at -O2 and above.
However, it has a tendency to undo x86's STV and RPAD passes,
by folding the more complex post-STV/RPAD form back into the
simpler pre-pass form.

Also, running a pass after register allocation means that we can
now match define_insn_and_splits that were previously only matched
before register allocation.  This trips things like:

  (define_insn_and_split "..."
    [...pattern...]
    "...cond..."
    "#"
    "&& 1"
    [...pattern...]
    {
      ...unconditional use of gen_reg_rtx ()...;
    }

because matching and splitting after RA will call gen_reg_rtx when
pseudos are no longer allowed.  rs6000 has several instances of this.

xtensa has a variation in which the split condition is:

"&& can_create_pseudo_p ()"

The failure then is that, if we match after RA, we'll never be
able to split the instruction.

The patch therefore disables the pass by default on i386, rs6000
and xtensa.  Hopefully we can fix those ports later (if their
maintainers want).  It seems better to add the pass first, though,
to make it easier to test any such fixes.

gcc.target/aarch64/bitfield-bitint-abi-align{16,8}.c would need
quite a few updates for the late-combine output.  That might be
worth doing, but it seems too complex to do as part of this patch.

I tried compiling at least one target per CPU directory and comparing
the assembly output for parts of the GCC testsuite.  This is just a way
of getting a flavour of how the pass performs; it obviously isn't a
meaningful benchmark.  All targets seemed to improve on average:

Target                Tests   Good   Bad   %Good    Delta  Median
======                =====   ====   ===   =====    =====  ======
aarch64-linux-gnu      2215   1975   240  89.16%    -4159      -1
aarch64_be-linux-gnu   1569   1483    86  94.52%   -10117      -1
alpha-linux-gnu        1454   1370    84  94.22%    -9502      -1
amdgcn-amdhsa          5122   4671   451  91.19%   -35737      -1
arc-elf                2166   1932   234  89.20%   -37742      -1
arm-linux-gnueabi      1953   1661   292  85.05%   -12415      -1
arm-linux-gnueabihf    1834   1549   285  84.46%   -11137      -1
avr-elf                4789   4330   459  90.42%  -441276      -4
bfin-elf               2795   2394   401  85.65%   -19252      -1
bpf-elf                3122   2928   194  93.79%    -8785      -1
c6x-elf                2227   1929   298  86.62%   -17339      -1
cris-elf               3464   3270   194  94.40%   -23263      -2
csky-elf               2915   2591   324  88.89%   -22146      -1
epiphany-elf           2399   2304    95  96.04%   -28698      -2
fr30-elf               7712   7299   413  94.64%   -99830      -2
frv-linux-gnu          3332   2877   455  86.34%   -25108      -1
ft32-elf               2775   2667   108  96.11%   -25029      -1
h8300-elf              3176   2862   314  90.11%   -29305      -2
hppa64-hp-hpux11.23    4287   4247    40  99.07%   -45963      -2
ia64-linux-gnu         2343   1946   397  83.06%    -9907      -2
iq2000-elf             9684   9637    47  99.51%  -126557      -2
lm32-elf               2681   2608    73  97.28%   -59884      -3
loongarch64-linux-gnu  1303   1218    85  93.48%   -13375      -2

[Bug rtl-optimization/114515] [15 Regression] Failure to use aarch64 lane forms after PR101523

2024-06-24 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114515

--- Comment #13 from GCC Commits  ---
The trunk branch has been updated by Richard Sandiford :

https://gcc.gnu.org/g:792f97b44ffc5e6a967292b3747fd835e99396e7

commit r15-1579-g792f97b44ffc5e6a967292b3747fd835e99396e7
Author: Richard Sandiford 
Date:   Mon Jun 24 08:43:19 2024 +0100

Add a late-combine pass [PR106594]

This patch adds a combine pass that runs late in the pipeline.
There are two instances: one between combine and split1, and one
after postreload.

The pass currently has a single objective: remove definitions by
substituting into all uses.  The pre-RA version tries to restrict
itself to cases that are likely to have a neutral or beneficial
effect on register pressure.

The patch fixes PR106594.  It also fixes a few FAILs and XFAILs
in the aarch64 test results, mostly due to making proper use of
MOVPRFX in cases where we didn't previously.

This is just a first step.  I'm hoping that the pass could be
used for other combine-related optimisations in future.  In particular,
the post-RA version doesn't need to restrict itself to cases where all
uses are substitutable, since it doesn't have to worry about register
pressure.  If we did that, and if we extended it to handle multi-register
REGs, the pass might be a viable replacement for regcprop, which in
turn might reduce the cost of having a post-RA instance of the new pass.

On most targets, the pass is enabled by default at -O2 and above.
However, it has a tendency to undo x86's STV and RPAD passes,
by folding the more complex post-STV/RPAD form back into the
simpler pre-pass form.

Also, running a pass after register allocation means that we can
now match define_insn_and_splits that were previously only matched
before register allocation.  This trips things like:

  (define_insn_and_split "..."
    [...pattern...]
    "...cond..."
    "#"
    "&& 1"
    [...pattern...]
    {
      ...unconditional use of gen_reg_rtx ()...;
    }

because matching and splitting after RA will call gen_reg_rtx when
pseudos are no longer allowed.  rs6000 has several instances of this.

xtensa has a variation in which the split condition is:

"&& can_create_pseudo_p ()"

The failure then is that, if we match after RA, we'll never be
able to split the instruction.

The patch therefore disables the pass by default on i386, rs6000
and xtensa.  Hopefully we can fix those ports later (if their
maintainers want).  It seems better to add the pass first, though,
to make it easier to test any such fixes.

gcc.target/aarch64/bitfield-bitint-abi-align{16,8}.c would need
quite a few updates for the late-combine output.  That might be
worth doing, but it seems too complex to do as part of this patch.

I tried compiling at least one target per CPU directory and comparing
the assembly output for parts of the GCC testsuite.  This is just a way
of getting a flavour of how the pass performs; it obviously isn't a
meaningful benchmark.  All targets seemed to improve on average:

Target                  Tests   Good   Bad    %Good    Delta  Median
======                  =====   ====   ===    =====    =====  ======
aarch64-linux-gnu        2215   1975   240   89.16%    -4159      -1
aarch64_be-linux-gnu     1569   1483    86   94.52%   -10117      -1
alpha-linux-gnu          1454   1370    84   94.22%    -9502      -1
amdgcn-amdhsa            5122   4671   451   91.19%   -35737      -1
arc-elf                  2166   1932   234   89.20%   -37742      -1
arm-linux-gnueabi        1953   1661   292   85.05%   -12415      -1
arm-linux-gnueabihf      1834   1549   285   84.46%   -11137      -1
avr-elf                  4789   4330   459   90.42%  -441276      -4
bfin-elf                 2795   2394   401   85.65%   -19252      -1
bpf-elf                  3122   2928   194   93.79%    -8785      -1
c6x-elf                  2227   1929   298   86.62%   -17339      -1
cris-elf                 3464   3270   194   94.40%   -23263      -2
csky-elf                 2915   2591   324   88.89%   -22146      -1
epiphany-elf             2399   2304    95   96.04%   -28698      -2
fr30-elf                 7712   7299   413   94.64%   -99830      -2
frv-linux-gnu            3332   2877   455   86.34%   -25108      -1
ft32-elf                 2775   2667   108   96.11%   -25029      -1
h8300-elf                3176   2862   314   90.11%   -29305      -2
hppa64-hp-hpux11.23      4287   4247    40   99.07%   -45963      -2
ia64-linux-gnu           2343   1946   397   83.06%    -9907      -2
iq2000-elf               9684   9637    47   99.51%  -126557      -2
lm32-elf                 2681   2608    73   97.28%   -59884      -3
loongarch64-linux-gnu    1303   1218    85   93.48%   -13375      -2

[Bug rtl-optimization/114575] [15 Regression] SVE addressing modes broken since g:839bc42772ba7af66af3bd16efed4a69511312ae

2024-06-24 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114575

--- Comment #4 from GCC Commits  ---
The trunk branch has been updated by Richard Sandiford :

https://gcc.gnu.org/g:792f97b44ffc5e6a967292b3747fd835e99396e7

commit r15-1579-g792f97b44ffc5e6a967292b3747fd835e99396e7
Author: Richard Sandiford 
Date:   Mon Jun 24 08:43:19 2024 +0100

Add a late-combine pass [PR106594]

This patch adds a combine pass that runs late in the pipeline.
There are two instances: one between combine and split1, and one
after postreload.

The pass currently has a single objective: remove definitions by
substituting into all uses.  The pre-RA version tries to restrict
itself to cases that are likely to have a neutral or beneficial
effect on register pressure.

The patch fixes PR106594.  It also fixes a few FAILs and XFAILs
in the aarch64 test results, mostly due to making proper use of
MOVPRFX in cases where we didn't previously.

This is just a first step.  I'm hoping that the pass could be
used for other combine-related optimisations in future.  In particular,
the post-RA version doesn't need to restrict itself to cases where all
uses are substitutable, since it doesn't have to worry about register
pressure.  If we did that, and if we extended it to handle multi-register
REGs, the pass might be a viable replacement for regcprop, which in
turn might reduce the cost of having a post-RA instance of the new pass.

On most targets, the pass is enabled by default at -O2 and above.
However, it has a tendency to undo x86's STV and RPAD passes,
by folding the more complex post-STV/RPAD form back into the
simpler pre-pass form.

Also, running a pass after register allocation means that we can
now match define_insn_and_splits that were previously only matched
before register allocation.  This trips things like:

  (define_insn_and_split "..."
    [...pattern...]
    "...cond..."
    "#"
    "&& 1"
    [...pattern...]
    {
      ...unconditional use of gen_reg_rtx ()...;
    }

because matching and splitting after RA will call gen_reg_rtx when
pseudos are no longer allowed.  rs6000 has several instances of this.

xtensa has a variation in which the split condition is:

"&& can_create_pseudo_p ()"

The failure then is that, if we match after RA, we'll never be
able to split the instruction.

The patch therefore disables the pass by default on i386, rs6000
and xtensa.  Hopefully we can fix those ports later (if their
maintainers want).  It seems better to add the pass first, though,
to make it easier to test any such fixes.

gcc.target/aarch64/bitfield-bitint-abi-align{16,8}.c would need
quite a few updates for the late-combine output.  That might be
worth doing, but it seems too complex to do as part of this patch.

I tried compiling at least one target per CPU directory and comparing
the assembly output for parts of the GCC testsuite.  This is just a way
of getting a flavour of how the pass performs; it obviously isn't a
meaningful benchmark.  All targets seemed to improve on average:

Target                  Tests   Good   Bad    %Good    Delta  Median
======                  =====   ====   ===    =====    =====  ======
aarch64-linux-gnu        2215   1975   240   89.16%    -4159      -1
aarch64_be-linux-gnu     1569   1483    86   94.52%   -10117      -1
alpha-linux-gnu          1454   1370    84   94.22%    -9502      -1
amdgcn-amdhsa            5122   4671   451   91.19%   -35737      -1
arc-elf                  2166   1932   234   89.20%   -37742      -1
arm-linux-gnueabi        1953   1661   292   85.05%   -12415      -1
arm-linux-gnueabihf      1834   1549   285   84.46%   -11137      -1
avr-elf                  4789   4330   459   90.42%  -441276      -4
bfin-elf                 2795   2394   401   85.65%   -19252      -1
bpf-elf                  3122   2928   194   93.79%    -8785      -1
c6x-elf                  2227   1929   298   86.62%   -17339      -1
cris-elf                 3464   3270   194   94.40%   -23263      -2
csky-elf                 2915   2591   324   88.89%   -22146      -1
epiphany-elf             2399   2304    95   96.04%   -28698      -2
fr30-elf                 7712   7299   413   94.64%   -99830      -2
frv-linux-gnu            3332   2877   455   86.34%   -25108      -1
ft32-elf                 2775   2667   108   96.11%   -25029      -1
h8300-elf                3176   2862   314   90.11%   -29305      -2
hppa64-hp-hpux11.23      4287   4247    40   99.07%   -45963      -2
ia64-linux-gnu           2343   1946   397   83.06%    -9907      -2
iq2000-elf               9684   9637    47   99.51%  -126557      -2
lm32-elf                 2681   2608    73   97.28%   -59884      -3
loongarch64-linux-gnu    1303   1218    85   93.48%   -13375      -2

[Bug rtl-optimization/106594] [13/14/15 Regression] sign-extensions no longer merged into addressing mode

2024-06-24 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106594

--- Comment #29 from GCC Commits  ---
The trunk branch has been updated by Richard Sandiford :

https://gcc.gnu.org/g:792f97b44ffc5e6a967292b3747fd835e99396e7

commit r15-1579-g792f97b44ffc5e6a967292b3747fd835e99396e7
Author: Richard Sandiford 
Date:   Mon Jun 24 08:43:19 2024 +0100

Add a late-combine pass [PR106594]

This patch adds a combine pass that runs late in the pipeline.
There are two instances: one between combine and split1, and one
after postreload.

The pass currently has a single objective: remove definitions by
substituting into all uses.  The pre-RA version tries to restrict
itself to cases that are likely to have a neutral or beneficial
effect on register pressure.

The patch fixes PR106594.  It also fixes a few FAILs and XFAILs
in the aarch64 test results, mostly due to making proper use of
MOVPRFX in cases where we didn't previously.

This is just a first step.  I'm hoping that the pass could be
used for other combine-related optimisations in future.  In particular,
the post-RA version doesn't need to restrict itself to cases where all
uses are substitutable, since it doesn't have to worry about register
pressure.  If we did that, and if we extended it to handle multi-register
REGs, the pass might be a viable replacement for regcprop, which in
turn might reduce the cost of having a post-RA instance of the new pass.

On most targets, the pass is enabled by default at -O2 and above.
However, it has a tendency to undo x86's STV and RPAD passes,
by folding the more complex post-STV/RPAD form back into the
simpler pre-pass form.

Also, running a pass after register allocation means that we can
now match define_insn_and_splits that were previously only matched
before register allocation.  This trips things like:

  (define_insn_and_split "..."
    [...pattern...]
    "...cond..."
    "#"
    "&& 1"
    [...pattern...]
    {
      ...unconditional use of gen_reg_rtx ()...;
    }

because matching and splitting after RA will call gen_reg_rtx when
pseudos are no longer allowed.  rs6000 has several instances of this.

xtensa has a variation in which the split condition is:

"&& can_create_pseudo_p ()"

The failure then is that, if we match after RA, we'll never be
able to split the instruction.

The patch therefore disables the pass by default on i386, rs6000
and xtensa.  Hopefully we can fix those ports later (if their
maintainers want).  It seems better to add the pass first, though,
to make it easier to test any such fixes.

gcc.target/aarch64/bitfield-bitint-abi-align{16,8}.c would need
quite a few updates for the late-combine output.  That might be
worth doing, but it seems too complex to do as part of this patch.

I tried compiling at least one target per CPU directory and comparing
the assembly output for parts of the GCC testsuite.  This is just a way
of getting a flavour of how the pass performs; it obviously isn't a
meaningful benchmark.  All targets seemed to improve on average:

Target                  Tests   Good   Bad    %Good    Delta  Median
======                  =====   ====   ===    =====    =====  ======
aarch64-linux-gnu        2215   1975   240   89.16%    -4159      -1
aarch64_be-linux-gnu     1569   1483    86   94.52%   -10117      -1
alpha-linux-gnu          1454   1370    84   94.22%    -9502      -1
amdgcn-amdhsa            5122   4671   451   91.19%   -35737      -1
arc-elf                  2166   1932   234   89.20%   -37742      -1
arm-linux-gnueabi        1953   1661   292   85.05%   -12415      -1
arm-linux-gnueabihf      1834   1549   285   84.46%   -11137      -1
avr-elf                  4789   4330   459   90.42%  -441276      -4
bfin-elf                 2795   2394   401   85.65%   -19252      -1
bpf-elf                  3122   2928   194   93.79%    -8785      -1
c6x-elf                  2227   1929   298   86.62%   -17339      -1
cris-elf                 3464   3270   194   94.40%   -23263      -2
csky-elf                 2915   2591   324   88.89%   -22146      -1
epiphany-elf             2399   2304    95   96.04%   -28698      -2
fr30-elf                 7712   7299   413   94.64%   -99830      -2
frv-linux-gnu            3332   2877   455   86.34%   -25108      -1
ft32-elf                 2775   2667   108   96.11%   -25029      -1
h8300-elf                3176   2862   314   90.11%   -29305      -2
hppa64-hp-hpux11.23      4287   4247    40   99.07%   -45963      -2
ia64-linux-gnu           2343   1946   397   83.06%    -9907      -2
iq2000-elf               9684   9637    47   99.51%  -126557      -2
lm32-elf                 2681   2608    73   97.28%   -59884      -3
loongarch64-linux-gnu    1303   1218    85   93.48%   -13375      -2

[Bug tree-optimization/115599] ICE: qsort checking failed during GIMPLE pass: reassoc (error: qsort comparator non-negative on sorted output: 150142972)

2024-06-24 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115599

--- Comment #5 from GCC Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:ae13af26060eb686418ea9c9d455cd665049402d

commit r15-1577-gae13af26060eb686418ea9c9d455cd665049402d
Author: Richard Biener 
Date:   Sun Jun 23 14:37:53 2024 +0200

tree-optimization/115599 - reassoc qsort comparator issue

The compare_repeat_factors comparator eventually fails qsort checking
because it uses rf2->rank - rf1->rank to compare unsigned numbers,
which causes issues for ranks whose difference is interpreted as
negative once the result is treated as signed.

Fixed by re-writing the obvious way.  I've also fixed the count
comparison which suffers from truncation as count is 64bit signed
while the comparator result is 32bit int (that's a lot less likely
to hit in practice though).

The testcase from the PR is too large to include.

PR tree-optimization/115599
* tree-ssa-reassoc.cc (compare_repeat_factors): Use explicit
compares to avoid truncations.
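
For illustration only, here is a minimal C sketch of the comparator problem
described above.  The struct, the function names and the ordering of the
count field are assumptions made for the example; only the unsigned rank,
the 64-bit signed count and the rf2->rank - rf1->rank expression come from
the commit message.  This is not the code in tree-ssa-reassoc.cc.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the repeat-factor record: only the two
   fields named in the commit message are modelled.  */
struct repeat_factor
{
  unsigned rank;
  int64_t count;
};

/* Subtraction-based variant, shown for contrast.  The 64-bit count
   difference is truncated to int, and the unsigned rank difference
   wraps before conversion, so either result can carry the wrong sign.
   (Both conversions are implementation-defined in C.)  */
int
compare_by_subtraction (const void *x1, const void *x2)
{
  const struct repeat_factor *rf1 = x1;
  const struct repeat_factor *rf2 = x2;
  if (rf1->count != rf2->count)
    return (int) (rf1->count - rf2->count);
  return (int) (rf2->rank - rf1->rank);
}

/* Explicit-compare variant: ascending count (an assumption of this
   sketch), then descending rank.  No arithmetic on the keys, so no
   truncation or wrap-around is possible.  */
int
compare_explicit (const void *x1, const void *x2)
{
  const struct repeat_factor *rf1 = x1;
  const struct repeat_factor *rf2 = x2;
  if (rf1->count < rf2->count)
    return -1;
  if (rf1->count > rf2->count)
    return 1;
  if (rf1->rank < rf2->rank)
    return 1;
  if (rf1->rank > rf2->rank)
    return -1;
  return 0;
}

/* Usage: sort N records with the explicit comparator.  */
void
sort_repeat_factors (struct repeat_factor *v, size_t n)
{
  qsort (v, n, sizeof *v, compare_explicit);
}
```

With ranks 0 and 3000000000 the subtraction-based variant is likely to
report the wrong order on targets with a 32-bit int, which is the kind of
inconsistency qsort checking detects.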

[Bug target/113325] unnecessary byte swap for memory clear

2024-06-23 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113325

--- Comment #2 from GCC Commits  ---
The master branch has been updated by HaoChen Gui :

https://gcc.gnu.org/g:6274f10318d05311f31147c895f76a01aec37830

commit r15-1576-g6274f10318d05311f31147c895f76a01aec37830
Author: Haochen Gui 
Date:   Mon Jun 24 13:16:12 2024 +0800

rs6000: Eliminate unnecessary byte swaps for duplicated constant vector store

gcc/
PR target/113325
* config/rs6000/vsx.md (vsx_stxvd2x4_le_const_): New.

gcc/testsuite/
PR target/113325
* gcc.target/powerpc/pr113325.c: New.

[Bug target/114846] powerpc: epilogue in _Unwind_RaiseException corrupts return value due to __builtin_eh_return

2024-06-23 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114846

--- Comment #16 from GCC Commits  ---
The releases/gcc-11 branch has been updated by Kewen Lin :

https://gcc.gnu.org/g:549701628b64a7c4ac9bb5f9623e83a8dc1d828c

commit r11-11535-g549701628b64a7c4ac9bb5f9623e83a8dc1d828c
Author: Kewen Lin 
Date:   Tue May 28 21:13:40 2024 -0500

rs6000: Don't clobber return value when eh_return called [PR114846]

As the associated test case in PR114846 shows, currently
with eh_return involved some register restoring for EH
RETURN DATA in epilogue can clobber the one which is holding
the return value.  Referring to the existing handlings in
some other targets, this patch makes eh_return expander
call one new define_insn_and_split eh_return_internal which
directly calls rs6000_emit_epilogue with epilogue_type
EPILOGUE_TYPE_EH_RETURN instead of the previous treating
normal return with crtl->calls_eh_return specially.

PR target/114846

gcc/ChangeLog:

* config/rs6000/rs6000-logue.c (rs6000_emit_epilogue): As
EPILOGUE_TYPE_EH_RETURN would be passed as epilogue_type directly
now, adjust the relevant handlings on it.
* config/rs6000/rs6000.md (eh_return expander): Append by calling
gen_eh_return_internal and emit_barrier.
(eh_return_internal): New define_insn_and_split, call function
rs6000_emit_epilogue with epilogue type EPILOGUE_TYPE_EH_RETURN.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr114846.c: New test.

(cherry picked from commit e5fc5d42d25c86ae48178db04ce64d340a834614)

[Bug target/114846] powerpc: epilogue in _Unwind_RaiseException corrupts return value due to __builtin_eh_return

2024-06-23 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114846

--- Comment #15 from GCC Commits  ---
The releases/gcc-12 branch has been updated by Kewen Lin :

https://gcc.gnu.org/g:0fd6ae9b20913ab84d596448e14411eedbd324f9

commit r12-10579-g0fd6ae9b20913ab84d596448e14411eedbd324f9
Author: Kewen Lin 
Date:   Tue May 28 21:13:40 2024 -0500

rs6000: Don't clobber return value when eh_return called [PR114846]

As the associated test case in PR114846 shows, currently
with eh_return involved some register restoring for EH
RETURN DATA in epilogue can clobber the one which is holding
the return value.  Referring to the existing handlings in
some other targets, this patch makes eh_return expander
call one new define_insn_and_split eh_return_internal which
directly calls rs6000_emit_epilogue with epilogue_type
EPILOGUE_TYPE_EH_RETURN instead of the previous treating
normal return with crtl->calls_eh_return specially.

PR target/114846

gcc/ChangeLog:

* config/rs6000/rs6000-logue.cc (rs6000_emit_epilogue): As
EPILOGUE_TYPE_EH_RETURN would be passed as epilogue_type directly
now, adjust the relevant handlings on it.
* config/rs6000/rs6000.md (eh_return expander): Append by calling
gen_eh_return_internal and emit_barrier.
(eh_return_internal): New define_insn_and_split, call function
rs6000_emit_epilogue with epilogue type EPILOGUE_TYPE_EH_RETURN.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr114846.c: New test.

(cherry picked from commit e5fc5d42d25c86ae48178db04ce64d340a834614)

[Bug target/114846] powerpc: epilogue in _Unwind_RaiseException corrupts return value due to __builtin_eh_return

2024-06-23 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114846

--- Comment #14 from GCC Commits  ---
The releases/gcc-13 branch has been updated by Kewen Lin :

https://gcc.gnu.org/g:dd54ed4ae417935300a3c4bb356d37c2ae7f731e

commit r13-8866-gdd54ed4ae417935300a3c4bb356d37c2ae7f731e
Author: Kewen Lin 
Date:   Tue May 28 21:13:40 2024 -0500

rs6000: Don't clobber return value when eh_return called [PR114846]

As the associated test case in PR114846 shows, currently
with eh_return involved some register restoring for EH
RETURN DATA in epilogue can clobber the one which is holding
the return value.  Referring to the existing handlings in
some other targets, this patch makes eh_return expander
call one new define_insn_and_split eh_return_internal which
directly calls rs6000_emit_epilogue with epilogue_type
EPILOGUE_TYPE_EH_RETURN instead of the previous treating
normal return with crtl->calls_eh_return specially.

PR target/114846

gcc/ChangeLog:

* config/rs6000/rs6000-logue.cc (rs6000_emit_epilogue): As
EPILOGUE_TYPE_EH_RETURN would be passed as epilogue_type directly
now, adjust the relevant handlings on it.
* config/rs6000/rs6000.md (eh_return expander): Append by calling
gen_eh_return_internal and emit_barrier.
(eh_return_internal): New define_insn_and_split, call function
rs6000_emit_epilogue with epilogue type EPILOGUE_TYPE_EH_RETURN.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr114846.c: New test.

(cherry picked from commit e5fc5d42d25c86ae48178db04ce64d340a834614)

[Bug target/114846] powerpc: epilogue in _Unwind_RaiseException corrupts return value due to __builtin_eh_return

2024-06-23 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114846

--- Comment #13 from GCC Commits  ---
The releases/gcc-14 branch has been updated by Kewen Lin :

https://gcc.gnu.org/g:2b5e8f918ef0027d2af8e53c4e114e1d133fc609

commit r14-10342-g2b5e8f918ef0027d2af8e53c4e114e1d133fc609
Author: Kewen Lin 
Date:   Tue May 28 21:13:40 2024 -0500

rs6000: Don't clobber return value when eh_return called [PR114846]

As the associated test case in PR114846 shows, currently
with eh_return involved some register restoring for EH
RETURN DATA in epilogue can clobber the one which is holding
the return value.  Referring to the existing handlings in
some other targets, this patch makes eh_return expander
call one new define_insn_and_split eh_return_internal which
directly calls rs6000_emit_epilogue with epilogue_type
EPILOGUE_TYPE_EH_RETURN instead of the previous treating
normal return with crtl->calls_eh_return specially.

PR target/114846

gcc/ChangeLog:

* config/rs6000/rs6000-logue.cc (rs6000_emit_epilogue): As
EPILOGUE_TYPE_EH_RETURN would be passed as epilogue_type directly
now, adjust the relevant handlings on it.
* config/rs6000/rs6000.md (eh_return expander): Append by calling
gen_eh_return_internal and emit_barrier.
(eh_return_internal): New define_insn_and_split, call function
rs6000_emit_epilogue with epilogue type EPILOGUE_TYPE_EH_RETURN.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr114846.c: New test.

(cherry picked from commit e5fc5d42d25c86ae48178db04ce64d340a834614)

[Bug target/114139] ICE: RTL check: expected code 'const_int', have 'reg' in riscv_macro_fusion_pair_p, at config/riscv/riscv.cc:8438 with -O2 -fpic -mexplicit-relocs -mcpu=sifive-p450

2024-06-23 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114139

--- Comment #1 from GCC Commits  ---
The master branch has been updated by Jeff Law :

https://gcc.gnu.org/g:fd536b8412d4dae42aa04739c06f99a915be6261

commit r15-1566-gfd536b8412d4dae42aa04739c06f99a915be6261
Author: Jeff Law 
Date:   Sun Jun 23 08:26:25 2024 -0600

[committed][RISC-V][PR target/114139] Verify we have a CONST_INT before extracting INTVAL

Run-of-the-mill checking issue.  We had something like (plus (reg) (reg))
and tried to extract INTVAL (XEXP (x, 1)) which of course blows up with
checking on.

Fixed thusly.  Tested on riscv32-elf in my tester.  riscv64-elf is in
flight, but won't finish for a while due to other tasks in flight.

PR target/114139
gcc/
* config/riscv/riscv.cc (riscv_macro_fusion_pair_p): Verify object
is a CONST_INT before looking at INTVAL.

gcc/testsuite/

* gcc.target/riscv/pr114139.c: New test.
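
The pattern behind this fix (check the operand's code before reading a
code-specific field) can be sketched in plain C.  The types and names below
are a toy model invented for the example, not GCC's rtx representation, and
the signed 12-bit range is an arbitrary choice for the sketch.

```c
#include <stdbool.h>

/* Toy operand model; illustrative only.  */
enum toy_code { TOY_REG, TOY_CONST_INT };

struct toy_operand
{
  enum toy_code code;
  union
  {
    unsigned regno;   /* valid only when code == TOY_REG */
    long value;       /* valid only when code == TOY_CONST_INT */
  } u;
};

/* Return true if OP is a constant integer in [-2048, 2047].
   The code check comes first: reading u.value from a register operand
   would be the toy equivalent of calling INTVAL on a REG.  */
bool
toy_small_immediate_p (const struct toy_operand *op)
{
  if (op->code != TOY_CONST_INT)
    return false;
  return op->u.value >= -2048 && op->u.value <= 2047;
}
```

A register operand is rejected by the first test, so the value field is
never read for it.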

[Bug middle-end/115597] [15 Regression] vectorizer takes 20+ h compiling 510.parest in SPECCPU2017 since g:46bb4ce4d30ab749d40f6f4cef6f1fb7c7813452

2024-06-23 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115597

--- Comment #5 from GCC Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:2a345214fc332b6f0821edf394ff8802b768db1d

commit r15-1565-g2a345214fc332b6f0821edf394ff8802b768db1d
Author: Richard Biener 
Date:   Sun Jun 23 11:26:39 2024 +0200

tree-optimization/115597 - allow CSE of two-operator VEC_PERM nodes

The following makes sure to always CSE when there's SLP_TREE_SCALAR_STMTS
as otherwise a chain of two-operator node operations can result in
exponential behavior of the CSE process as likely seen when building
510.parest on aarch64.

PR tree-optimization/115597
* tree-vect-slp.cc (vect_cse_slp_nodes): Allow to CSE
VEC_PERM nodes.
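
As a rough analogy for the exponential behavior mentioned above, the toy C
sketch below walks a chain of two-operand nodes whose operands both refer
to the same child.  It does not model SLP trees or vect_cse_slp_nodes; the
node type and both functions are assumptions of the example.  It only shows
why a walk that does not recognise shared nodes grows as 2^n, while one
that does grows linearly.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy two-operand node; illustrative only.  */
struct toy_node
{
  struct toy_node *op[2];
  bool visited;
};

/* Visit every path: a shared child is walked once per parent edge.  */
unsigned long
toy_walk_naive (const struct toy_node *n)
{
  if (n == NULL)
    return 0;
  return 1 + toy_walk_naive (n->op[0]) + toy_walk_naive (n->op[1]);
}

/* Visit every node once: already-seen nodes are skipped.  */
unsigned long
toy_walk_once (struct toy_node *n)
{
  if (n == NULL || n->visited)
    return 0;
  n->visited = true;
  return 1 + toy_walk_once (n->op[0]) + toy_walk_once (n->op[1]);
}
```

For a chain of n such nodes the first walk performs 2^n - 1 visits and the
second performs n.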

[Bug tree-optimization/115579] [15 regression] wrong code at -Os with "-fno-tree-sra" on x86_64-linux-gnu since r15-1391-g4b75ed33fa5fd6

2024-06-23 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115579

--- Comment #9 from GCC Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:8a1795bddcd34284936af4706f762d89c60fc69c

commit r15-1564-g8a1795bddcd34284936af4706f762d89c60fc69c
Author: Richard Biener 
Date:   Sat Jun 22 14:59:09 2024 +0200

tree-optimization/115579 - fix wrong code with store-motion

The recent change to relax store motion for variables that cannot have
store data races broke the optimization to share flag vars for stores
that all happen in the same single BB.  The following fixes this.

PR tree-optimization/115579
* tree-ssa-loop-im.cc (execute_sm): Return the auxiliary data
created.
(hoist_memory_references): Record the flag var that's eventually
created and re-use it when all stores are in the same BB.

* gcc.dg/pr115579.c: New testcase.

[Bug target/115409] avx512 intrinsics trigger -Wshift-overflow

2024-06-23 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115409

--- Comment #5 from GCC Commits  ---
The master branch has been updated by hongtao Liu :

https://gcc.gnu.org/g:4c957d7ba84d8bbce6e778048f38e92ef71806c8

commit r15-1563-g4c957d7ba84d8bbce6e778048f38e92ef71806c8
Author: Collin Funk 
Date:   Mon Jun 10 06:36:47 2024 +

AVX-512: Pacify -Wshift-overflow=2. [PR115409]

A shift of 31 on a signed int is undefined behavior.  Since unsigned
int is 32-bits wide this change fixes it and silences the warning.

gcc/ChangeLog:

PR target/115409
* config/i386/avx512fp16intrin.h (_mm512_conj_pch): Make the
constant unsigned before shifting.
* config/i386/avx512fp16vlintrin.h (_mm256_conj_pch): Likewise.
(_mm_conj_pch): Likewise.

Signed-off-by: Collin Funk
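
A minimal C sketch of the idea, assuming a 32-bit unsigned int as the
commit message does.  The helper name is hypothetical and the code is not
taken from the intrinsics headers; it only shows the unsigned form of the
shift.

```c
#include <stdint.h>

/* Toggle the top bit of a 32-bit pattern.  Writing the constant as 1u
   keeps the shift in unsigned arithmetic, where shifting into bit 31
   is well defined; the signed form 1 << 31 overflows a 32-bit int.  */
uint32_t
toy_flip_top_bit (uint32_t bits)
{
  return bits ^ (1u << 31);
}
```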

[Bug target/115342] [14/15 Regression] AArch64: Function multiversioning initialization incorrect

2024-06-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115342

--- Comment #4 from GCC Commits  ---
The releases/gcc-14 branch has been updated by Wilco Dijkstra:

https://gcc.gnu.org/g:9421f02916676d27e24fcda918f85e359329ac69

commit r14-10338-g9421f02916676d27e24fcda918f85e359329ac69
Author: Wilco Dijkstra 
Date:   Wed Jun 5 14:04:33 2024 +0100

AArch64: Fix cpu features initialization [PR115342]

The CPU features initialization code uses CPUID registers (rather than
HWCAP).  The equality comparisons it uses are incorrect: for example
FEAT_SVE is not set if SVE2 is available.  Using HWCAPs for these is both
simpler and correct.  The initialization must also be done atomically to
avoid multiple threads causing corruption due to non-atomic RMW accesses
to the global.

libgcc:
PR target/115342
* config/aarch64/cpuinfo.c (__init_cpu_features_constructor):
Use HWCAP where possible.  Use atomic write for initialization.
Fix FEAT_PREDRES comparison.
(__init_cpu_features_resolver): Use atomic load for correct
initialization.
(__init_cpu_features): Likewise.
(cherry picked from commit d7cbcfe7c33645eaf95f175f19884d443817857b)
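
The two problems described above can be sketched with C11 atomics.  All
names and bit values below are hypothetical and do not match libgcc's
cpuinfo.c or the real HWCAP constants.  The sketch assumes an ID-style
field where 1 means the base feature and larger values mean the base
feature plus extensions.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bits; not the real HWCAP or libgcc values.  */
#define TOY_FEAT_BASE (UINT64_C (1) << 0)
#define TOY_FEAT_EXT  (UINT64_C (1) << 1)
#define TOY_FEAT_INIT (UINT64_C (1) << 63)

static _Atomic uint64_t toy_features;

/* Equality test on an ID-style field, shown for contrast: a field
   value of 2 (base plus extension) is wrongly reported as "no base
   feature".  */
bool
toy_has_base_by_equality (unsigned id_field)
{
  return id_field == 1;
}

/* Build the whole mask in a local variable from capability-style
   bits, then publish it with one atomic store, so concurrent callers
   never observe or overwrite a half-built value.  */
void
toy_init_features (uint64_t hwcap_like)
{
  uint64_t feat = TOY_FEAT_INIT;
  if (hwcap_like & TOY_FEAT_BASE)
    feat |= TOY_FEAT_BASE;
  if (hwcap_like & TOY_FEAT_EXT)
    feat |= TOY_FEAT_EXT;
  atomic_store_explicit (&toy_features, feat, memory_order_relaxed);
}

/* Read the published mask with a matching atomic load.  */
uint64_t
toy_get_features (void)
{
  return atomic_load_explicit (&toy_features, memory_order_relaxed);
}
```

Each capability bit is tested on its own, so an extension never hides the
base feature, and the global is written exactly once per initialization
call.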

[Bug libstdc++/115454] std::experimental::find_last_set is buggy on x86-64-v4

2024-06-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115454

--- Comment #7 from GCC Commits  ---
The releases/gcc-12 branch has been updated by Matthias Kretz:

https://gcc.gnu.org/g:8b5bdeb8aa2c2f6dbd448a8f7d500d9eaece48e1

commit r12-10574-g8b5bdeb8aa2c2f6dbd448a8f7d500d9eaece48e1
Author: Matthias Kretz 
Date:   Fri Jun 14 15:11:25 2024 +0200

libstdc++: Fix find_last_set(simd_mask) to ignore padding bits

With the change to the AVX512 find_last_set implementation, the change
to AVX512 operator!= is unnecessary. However, the latter was not
producing optimal code and unnecessarily set the padding bits. In
theory, the compiler could determine that with the new !=
implementation, the bit operation for clearing the padding bits is a
no-op and can be elided.

Signed-off-by: Matthias Kretz 

libstdc++-v3/ChangeLog:

PR libstdc++/115454
* include/experimental/bits/simd_x86.h (_S_not_equal_to): Use
neq comparison instead of bitwise negation after eq.
(_S_find_last_set): Clear unused high bits before computing
bit_width.
* testsuite/experimental/simd/pr115454_find_last_set.cc: New
test.

(cherry picked from commit 1340ddea0158de3f49aeb75b4013e5fc313ff6f4)
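
The "clear unused high bits before computing bit_width" step from the
ChangeLog can be sketched in scalar C.  The function below is a
hypothetical helper, not the simd_x86.h implementation; returning -1 for an
empty mask is a choice made for this sketch only.

```c
#include <stdint.h>

/* Return the index of the highest set bit among the low NBITS bits of
   MASK, or -1 if none of them is set.  Bits at positions >= NBITS are
   treated as padding and ignored.  */
int
toy_find_last_set (uint64_t mask, unsigned nbits)
{
  if (nbits < 64)
    mask &= (UINT64_C (1) << nbits) - 1;   /* clear the padding bits */

  int width = 0;                           /* bit width of the masked value */
  while (mask != 0)
    {
      ++width;
      mask >>= 1;
    }
  return width - 1;
}
```

Without the masking step, stray padding bits in a 4-element mask such as
0xF5 would give index 7 rather than 2.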

[Bug libstdc++/115575] experimental/simd/pr115454_find_last_set.cc FAILs

2024-06-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115575

--- Comment #10 from GCC Commits  ---
The releases/gcc-12 branch has been updated by Matthias Kretz:

https://gcc.gnu.org/g:169d4d1addaac7eef6cde4049aa8b4f3d81c28b0

commit r12-10575-g169d4d1addaac7eef6cde4049aa8b4f3d81c28b0
Author: Matthias Kretz 
Date:   Fri Jun 21 16:22:22 2024 +0200

libstdc++: Fix test on x86_64 and non-simd targets

* Running a test compiled with AVX512 instructions requires
avx512f_runtime not just avx512f.

* The 'reduce2' test violated an invariant of fixed_size_simd_mask and
thus failed on all targets without 16-Byte vector builtins enabled (in
bits/simd.h).

Signed-off-by: Matthias Kretz 

libstdc++-v3/ChangeLog:

PR libstdc++/115575
* testsuite/experimental/simd/pr115454_find_last_set.cc: Require
avx512f_runtime. Don't memcpy fixed_size masks.

(cherry picked from commit 77f321435b4ac37992c2ed6737ca0caa1dd50551)

[Bug libstdc++/115575] experimental/simd/pr115454_find_last_set.cc FAILs

2024-06-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115575

--- Comment #9 from GCC Commits  ---
The releases/gcc-13 branch has been updated by Matthias Kretz:

https://gcc.gnu.org/g:c335e34ff89ec9aec1ba874dc5cece9c2303c906

commit r13-8862-gc335e34ff89ec9aec1ba874dc5cece9c2303c906
Author: Matthias Kretz 
Date:   Fri Jun 21 16:22:22 2024 +0200

libstdc++: Fix test on x86_64 and non-simd targets

* Running a test compiled with AVX512 instructions requires
avx512f_runtime not just avx512f.

* The 'reduce2' test violated an invariant of fixed_size_simd_mask and
thus failed on all targets without 16-Byte vector builtins enabled (in
bits/simd.h).

Signed-off-by: Matthias Kretz 

libstdc++-v3/ChangeLog:

PR libstdc++/115575
* testsuite/experimental/simd/pr115454_find_last_set.cc: Require
avx512f_runtime. Don't memcpy fixed_size masks.

(cherry picked from commit 77f321435b4ac37992c2ed6737ca0caa1dd50551)

[Bug libstdc++/115454] std::experimental::find_last_set is buggy on x86-64-v4

2024-06-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115454

--- Comment #6 from GCC Commits  ---
The releases/gcc-13 branch has been updated by Matthias Kretz:

https://gcc.gnu.org/g:fbd088a069b172cae4e268abe2d38e567ef97990

commit r13-8861-gfbd088a069b172cae4e268abe2d38e567ef97990
Author: Matthias Kretz 
Date:   Fri Jun 14 15:11:25 2024 +0200

libstdc++: Fix find_last_set(simd_mask) to ignore padding bits

With the change to the AVX512 find_last_set implementation, the change
to AVX512 operator!= is unnecessary. However, the latter was not
producing optimal code and unnecessarily set the padding bits. In
theory, the compiler could determine that with the new !=
implementation, the bit operation for clearing the padding bits is a
no-op and can be elided.

Signed-off-by: Matthias Kretz 

libstdc++-v3/ChangeLog:

PR libstdc++/115454
* include/experimental/bits/simd_x86.h (_S_not_equal_to): Use
neq comparison instead of bitwise negation after eq.
(_S_find_last_set): Clear unused high bits before computing
bit_width.
* testsuite/experimental/simd/pr115454_find_last_set.cc: New
test.

(cherry picked from commit 1340ddea0158de3f49aeb75b4013e5fc313ff6f4)
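
A minimal sketch of the idea behind the _S_find_last_set change, assuming a
mask held in a wider integer whose upper bits are padding. The names
find_last_set_index and bit_width_u32 are hypothetical; this is not the
libstdc++ implementation, only an illustration of clearing the unused high
bits before computing the bit width.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of bits needed to represent x (0 for x == 0).
constexpr int bit_width_u32(std::uint32_t x)
{
  int width = 0;
  while (x != 0)
    {
      ++width;
      x >>= 1;
    }
  return width;
}

// Index of the highest set bit among the low `size` bits of `bits`.
// Bits at positions >= size are padding and may hold arbitrary values.
// Assumed precondition: 0 < size <= 32 and at least one valid bit is set.
constexpr int find_last_set_index(std::uint32_t bits, int size)
{
  const std::uint32_t valid = size >= 32 ? ~std::uint32_t(0)
                                         : (std::uint32_t(1) << size) - 1;
  // Clearing the padding first keeps stray high bits from inflating the width.
  return bit_width_u32(bits & valid) - 1;
}
```

Without the masking step, a 4-element mask stored as 0xF5 would report index
7 instead of 2.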

[Bug libstdc++/115575] experimental/simd/pr115454_find_last_set.cc FAILs

2024-06-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115575

--- Comment #8 from GCC Commits  ---
The releases/gcc-11 branch has been updated by Matthias Kretz:

https://gcc.gnu.org/g:87cda03e707f9f3e049905a0f698221f3c7db148

commit r11-11531-g87cda03e707f9f3e049905a0f698221f3c7db148
Author: Matthias Kretz 
Date:   Fri Jun 21 16:22:22 2024 +0200

libstdc++: Fix test on x86_64 and non-simd targets

* Running a test compiled with AVX512 instructions requires
avx512f_runtime not just avx512f.

* The 'reduce2' test violated an invariant of fixed_size_simd_mask and
thus failed on all targets without 16-Byte vector builtins enabled (in
bits/simd.h).

Signed-off-by: Matthias Kretz 

libstdc++-v3/ChangeLog:

PR libstdc++/115575
* testsuite/experimental/simd/pr115454_find_last_set.cc: Require
avx512f_runtime. Don't memcpy fixed_size masks.

(cherry picked from commit 77f321435b4ac37992c2ed6737ca0caa1dd50551)

[Bug libstdc++/115497] [15 Regression] __is_pointer doesn't compile with clang since 014879ea4c86b3b8ab6b61a1226ee5b31e816c8b

2024-06-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115497

--- Comment #21 from GCC Commits  ---
The master branch has been updated by Jonathan Wakely :

https://gcc.gnu.org/g:51cc77672add517123ef9ea45335b08442e8d57c

commit r15-1552-g51cc77672add517123ef9ea45335b08442e8d57c
Author: Jonathan Wakely 
Date:   Wed Jun 19 11:19:58 2024 +0100

libstdc++: Remove std::__is_void class template [PR115497]

This removes the std::__is_void trait, as it conflicts with a Clang
built-in. There is only one use of the trait, which can easily be
replaced by simpler code.

Although Clang has a hack to make the class template work despite using
a reserved name, removing std::__is_void will allow that hack to be
dropped at some future date.

libstdc++-v3/ChangeLog:

PR libstdc++/115497
* include/bits/cpp_type_traits.h (__is_void): Remove.
* include/debug/helper_functions.h (_Distance_traits):
Adjust partial specialization to match void directly, instead of
using __is_void::__type and matching __true_type.

[Bug libstdc++/115575] experimental/simd/pr115454_find_last_set.cc FAILs

2024-06-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115575

--- Comment #7 from GCC Commits  ---
The releases/gcc-14 branch has been updated by Matthias Kretz:

https://gcc.gnu.org/g:a851931bc0d7a2b39ccc1f236015aabf24ee51f9

commit r14-10337-ga851931bc0d7a2b39ccc1f236015aabf24ee51f9
Author: Matthias Kretz 
Date:   Fri Jun 21 16:22:22 2024 +0200

libstdc++: Fix test on x86_64 and non-simd targets

* Running a test compiled with AVX512 instructions requires
avx512f_runtime not just avx512f.

* The 'reduce2' test violated an invariant of fixed_size_simd_mask and
thus failed on all targets without 16-Byte vector builtins enabled (in
bits/simd.h).

Signed-off-by: Matthias Kretz 

libstdc++-v3/ChangeLog:

PR libstdc++/115575
* testsuite/experimental/simd/pr115454_find_last_set.cc: Require
avx512f_runtime. Don't memcpy fixed_size masks.

(cherry picked from commit 77f321435b4ac37992c2ed6737ca0caa1dd50551)

[Bug libstdc++/115497] [15 Regression] __is_pointer doesn't compile with clang since 014879ea4c86b3b8ab6b61a1226ee5b31e816c8b

2024-06-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115497

--- Comment #22 from GCC Commits  ---
The master branch has been updated by Jonathan Wakely :

https://gcc.gnu.org/g:52a82359073653e312aaa5703f7e0ce339588961

commit r15-1553-g52a82359073653e312aaa5703f7e0ce339588961
Author: Jonathan Wakely 
Date:   Wed Jun 19 17:26:37 2024 +0100

libstdc++: Remove std::__is_pointer and std::__is_scalar [PR115497]

This removes the std::__is_pointer and std::__is_scalar traits, as they
conflict with Clang built-ins.

Although Clang has a hack to make the class templates work despite using
reserved names, removing these class templates will allow that hack to
be dropped at some future date.

libstdc++-v3/ChangeLog:

PR libstdc++/115497
* include/bits/cpp_type_traits.h (__is_pointer, __is_scalar):
Remove.
(__is_arithmetic): Do not use __is_pointer in the primary
template. Add partial specialization for pointers.

[Bug libstdc++/115497] [15 Regression] __is_pointer doesn't compile with clang since 014879ea4c86b3b8ab6b61a1226ee5b31e816c8b

2024-06-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115497

--- Comment #20 from GCC Commits  ---
The master branch has been updated by Jonathan Wakely :

https://gcc.gnu.org/g:5f10547e021db3a4a34382cd067668f9ef97fdeb

commit r15-1551-g5f10547e021db3a4a34382cd067668f9ef97fdeb
Author: Jonathan Wakely 
Date:   Wed Jun 19 17:21:16 2024 +0100

libstdc++: Stop using std::__is_pointer in  and 
[PR115497]

This replaces all uses of the std::__is_pointer type trait with uses of
the new __is_pointer built-in. Since the class template was only used to
enable some performance optimizations for algorithms, we can use the
built-in when __has_builtin(__is_pointer) is true (which is the case for
GCC trunk and for current versions of Clang) and just forego the
optimization otherwise.

Removing the uses of std::__is_pointer means it can be removed from
, which is another step towards fixing PR
115497.

libstdc++-v3/ChangeLog:

PR libstdc++/115497
* include/bits/deque.tcc (__lex_cmp_dit): Replace __is_pointer
class template with __is_pointer(T) built-in.
(__lexicographical_compare_aux1): Likewise.
* include/bits/stl_algobase.h (__equal_aux1): Likewise.
(__lexicographical_compare_aux1): Likewise.
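
The "use the built-in when available, otherwise forgo the optimization"
approach described above can be sketched roughly as follows. The function
name known_pointer is hypothetical and the code is only an illustration of
the __has_builtin guard, not the code in stl_algobase.h or deque.tcc.

```cpp
#include <cassert>

// Older compilers may not provide __has_builtin at all.
#ifndef __has_builtin
# define __has_builtin(x) 0
#endif

// Hypothetical trait: true only when the compiler can tell us, via the
// built-in, that T is a pointer. Otherwise it conservatively answers
// false and a caller would simply skip the optimized path.
template<typename T>
constexpr bool known_pointer()
{
#if __has_builtin(__is_pointer)
  return __is_pointer(T);
#else
  return false;
#endif
}
```

The fallback never claims that a type is a pointer, so on a compiler without
the built-in the only cost is a missed optimization.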

[Bug libstdc++/115497] [15 Regression] __is_pointer doesn't compile with clang since 014879ea4c86b3b8ab6b61a1226ee5b31e816c8b

2024-06-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115497

--- Comment #19 from GCC Commits  ---
The master branch has been updated by Jonathan Wakely :

https://gcc.gnu.org/g:139d65d1f5a60ac90479653a4f9b63618509f3f9

commit r15-1550-g139d65d1f5a60ac90479653a4f9b63618509f3f9
Author: Jonathan Wakely 
Date:   Wed Jun 19 11:19:58 2024 +0100

libstdc++: Don't use std::__is_scalar in std::valarray initialization
[PR115497]

This removes the use of the std::__is_scalar trait from ,
where it can be replaced by __is_trivial. It's used to decide whether we
can use memset to value-initialize valarray elements, but memset is
suitable for any trivial types, because value-initializing them is
equivalent to filling them with zeros.

This is another step towards removing the class templates in
 that conflict with Clang built-in names.

libstdc++-v3/ChangeLog:

PR libstdc++/115497
* include/bits/valarray_array.h (__valarray_default_construct):
Use __is_trivial(_Tp) instead of __is_scalar<_Tp>.
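
As a rough illustration of the equivalence the commit message relies on, the
sketch below value-initializes a few elements of a trivial type and compares
the result with zero-filled storage. The type Point and the function
value_init_matches_memset are hypothetical, and the bytewise comparison
assumes a type without padding on the target ABI.

```cpp
#include <cassert>
#include <cstring>
#include <new>
#include <type_traits>

// Hypothetical element type: trivial, and laid out without padding on
// common ABIs, so a bytewise comparison is meaningful.
struct Point
{
  int x;
  int y;
};

static_assert(std::is_trivial<Point>::value, "Point must be trivial");

// Returns true if value-initializing n elements gives the same bytes as
// zero-filling the storage with memset.
inline bool value_init_matches_memset()
{
  constexpr int n = 4;
  alignas(Point) unsigned char a[n * sizeof(Point)];
  alignas(Point) unsigned char b[n * sizeof(Point)];

  // Pre-fill with a non-zero pattern so stale zeros cannot hide a bug.
  std::memset(a, 0xAB, sizeof a);
  std::memset(b, 0xAB, sizeof b);

  for (int i = 0; i < n; ++i)
    ::new (static_cast<void*>(a + i * sizeof(Point))) Point();  // value-init

  std::memset(b, 0, sizeof b);                                  // zero-fill

  return std::memcmp(a, b, sizeof a) == 0;
}
```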

[Bug libstdc++/115497] [15 Regression] __is_pointer doesn't compile with clang since 014879ea4c86b3b8ab6b61a1226ee5b31e816c8b

2024-06-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115497

--- Comment #18 from GCC Commits  ---
The master branch has been updated by Jonathan Wakely :

https://gcc.gnu.org/g:b3743181899c5490a94c4dbde56a69ab77a40f11

commit r15-1549-gb3743181899c5490a94c4dbde56a69ab77a40f11
Author: Jonathan Wakely 
Date:   Wed Jun 19 16:14:56 2024 +0100

libstdc++: Fix std::fill and std::fill_n optimizations [PR109150]

As noted in the PR, the optimization used for scalar types in std::fill
and std::fill_n is non-conforming, because it doesn't consider that
assigning a scalar type might have non-trivial side effects which are
affected by the optimization.

By changing the condition under which the optimization is done we ensure
it's only performed when safe to do so, and we also enable it for
additional types, which was the original subject of the PR.

Instead of two overloads using __enable_if<__is_scalar::__value, R>
we can combine them into one and create a local variable which is either
a local copy of __value or another reference to it, depending on whether
the optimization is allowed.

This removes a use of std::__is_scalar, which is a step towards fixing
PR 115497 by removing std::__is_pointer from 

libstdc++-v3/ChangeLog:

PR libstdc++/109150
* include/bits/stl_algobase.h (__fill_a1): Combine the
!__is_scalar and __is_scalar overloads into one and rewrite the
condition used to decide whether to perform the load outside the
loop.
* testsuite/25_algorithms/fill/109150.cc: New test.
* testsuite/25_algorithms/fill_n/109150.cc: New test.
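
The single-overload approach described above, where a local variable is
either a copy of the value or another reference to it, might look roughly
like this. The names fill_sketch and load_outside_loop are hypothetical, and
the condition used here (is_arithmetic) is an assumption chosen for
illustration; the commit message does not spell out the condition libstdc++
uses.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical condition, chosen only for illustration: copy the value
// up front when T is an arithmetic type.
template<typename T>
struct load_outside_loop
: std::integral_constant<bool, std::is_arithmetic<T>::value> { };

template<typename Iter, typename T>
void fill_sketch(Iter first, Iter last, const T& value)
{
  // Either a by-value copy (read once, before the loop) or just another
  // reference to the caller's object (re-read on every assignment).
  using local_type = typename std::conditional<load_outside_loop<T>::value,
                                               const T, const T&>::type;
  local_type tmp = value;
  for (; first != last; ++first)
    *first = tmp;
}
```

Selecting the type of tmp with std::conditional keeps one function body for
both cases, which is the point of merging the two overloads.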

[Bug libstdc++/109150] std::fill should use __gnu_cxx::__is_scalar overloads for all scalars

2024-06-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109150

--- Comment #8 from GCC Commits  ---
The master branch has been updated by Jonathan Wakely :

https://gcc.gnu.org/g:b3743181899c5490a94c4dbde56a69ab77a40f11

commit r15-1549-gb3743181899c5490a94c4dbde56a69ab77a40f11
Author: Jonathan Wakely 
Date:   Wed Jun 19 16:14:56 2024 +0100

libstdc++: Fix std::fill and std::fill_n optimizations [PR109150]

As noted in the PR, the optimization used for scalar types in std::fill
and std::fill_n is non-conforming, because it doesn't consider that
assigning a scalar type might have non-trivial side effects which are
affected by the optimization.

By changing the condition under which the optimization is done we ensure
it's only performed when safe to do so, and we also enable it for
additional types, which was the original subject of the PR.

Instead of two overloads using __enable_if<__is_scalar::__value, R>
we can combine them into one and create a local variable which is either
a local copy of __value or another reference to it, depending on whether
the optimization is allowed.

This removes a use of std::__is_scalar, which is a step towards fixing
PR 115497 by removing std::__is_pointer from 

libstdc++-v3/ChangeLog:

PR libstdc++/109150
* include/bits/stl_algobase.h (__fill_a1): Combine the
!__is_scalar and __is_scalar overloads into one and rewrite the
condition used to decide whether to perform the load outside the
loop.
* testsuite/25_algorithms/fill/109150.cc: New test.
* testsuite/25_algorithms/fill_n/109150.cc: New test.

[Bug libstdc++/115575] experimental/simd/pr115454_find_last_set.cc FAILs

2024-06-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115575

--- Comment #6 from GCC Commits  ---
The master branch has been updated by Matthias Kretz :

https://gcc.gnu.org/g:77f321435b4ac37992c2ed6737ca0caa1dd50551

commit r15-1548-g77f321435b4ac37992c2ed6737ca0caa1dd50551
Author: Matthias Kretz 
Date:   Fri Jun 21 16:22:22 2024 +0200

libstdc++: Fix test on x86_64 and non-simd targets

* Running a test compiled with AVX512 instructions requires
avx512f_runtime not just avx512f.

* The 'reduce2' test violated an invariant of fixed_size_simd_mask and
thus failed on all targets without 16-Byte vector builtins enabled (in
bits/simd.h).

Signed-off-by: Matthias Kretz 

libstdc++-v3/ChangeLog:

PR libstdc++/115575
* testsuite/experimental/simd/pr115454_find_last_set.cc: Require
avx512f_runtime. Don't memcpy fixed_size masks.

[Bug testsuite/109360] RFE: check that generated .sarif files validate against the SARIF schema

2024-06-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109360

--- Comment #2 from GCC Commits  ---
The master branch has been updated by David Malcolm :

https://gcc.gnu.org/g:a84fe222029ff21903283cc8ee4bc760ebf80ec2

commit r15-1541-ga84fe222029ff21903283cc8ee4bc760ebf80ec2
Author: David Malcolm 
Date:   Fri Jun 21 08:46:14 2024 -0400

testsuite: check that generated .sarif files validate against the SARIF
schema [PR109360]

This patch extends the dg directive verify-sarif-file so that if
the "jsonschema" tool is available, it will be used to validate the
generated .sarif file.

Tested with jsonschema 3.2 with Python 3.8

gcc/ChangeLog:
PR testsuite/109360
* doc/install.texi: Mention optional usage of "jsonschema" tool.

gcc/testsuite/ChangeLog:
PR testsuite/109360
* lib/sarif-schema-2.1.0.json: New file, downloaded from
   
https://docs.oasis-open.org/sarif/sarif/v2.1.0/os/schemas/sarif-schema-2.1.0.json
Licensing information can be seen at
https://github.com/oasis-tcs/sarif-spec/issues/583
which states "They are free to incorporate it into their
implementation. No need for special permission or paperwork from
OASIS."
* lib/scansarif.exp (verify-sarif-file): If "jsonschema" is
available, use it to verify that the .sarif file complies with the
SARIF schema.
* lib/target-supports.exp (check_effective_target_jsonschema):
New.

Signed-off-by: David Malcolm

[Bug testsuite/109360] RFE: check that generated .sarif files validate against the SARIF schema

2024-06-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109360

--- Comment #1 from GCC Commits  ---
The master branch has been updated by David Malcolm :

https://gcc.gnu.org/g:9f4fdc3acebcf6b045edea1361570658da4bc0ab

commit r15-1540-g9f4fdc3acebcf6b045edea1361570658da4bc0ab
Author: David Malcolm 
Date:   Fri Jun 21 08:46:13 2024 -0400

diagnostics: fixes to SARIF output [PR109360]

When adding validation of .sarif files against the schema
(PR testsuite/109360) I discovered various issues where we were
generating invalid .sarif files.

Specifically, in
  c-c++-common/diagnostic-format-sarif-file-bad-utf8-pr109098-1.c
the relatedLocations for the "note" diagnostics were missing column
numbers, leading to validation failure due to non-unique elements,
such as multiple:
"message": {"text": "invalid UTF-8 character "}},
on line 25 with no column information.

Root cause is that for some diagnostics in libcpp we have a location_t
representing the line as a whole, setting a column_override on the
rich_location (since the line hasn't been fully read yet).  We were
handling this column override for plain text output, but not for .sarif
output.

Similarly, in diagnostic-format-sarif-file-pr111700.c there is a warning
emitted on "line 0" of the file, whereas SARIF requires line numbers to
be positive.

We also use column == 0 internally to mean "the line as a whole",
whereas SARIF requires column numbers to be positive.

This patch fixes these various issues.

gcc/ChangeLog:
PR testsuite/109360
* diagnostic-format-sarif.cc
(sarif_builder::make_location_object): Pass any column override
from rich_loc to maybe_make_physical_location_object.
(sarif_builder::maybe_make_physical_location_object): Add
"column_override" param and pass it to maybe_make_region_object.
(sarif_builder::maybe_make_region_object): Add "column_override"
param and use it when the location has 0 for a column.  Don't
add "startLine", "startColumn", "endLine", or "endColumn" if
the values aren't positive.
(sarif_builder::maybe_make_region_object_for_context): Don't
add "startLine" or "endLine" if the values aren't positive.

libcpp/ChangeLog:
PR testsuite/109360
* include/rich-location.h (rich_location::get_column_override):
New accessor.

Signed-off-by: David Malcolm
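
The region handling described in the ChangeLog above can be sketched as
follows. This is only an illustration under stated assumptions: region_object
and make_region_sketch are hypothetical names, a std::map stands in for the
JSON object, and only the start properties are modelled.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a SARIF "region" JSON object: property -> value.
using region_object = std::map<std::string, int>;

// Build a region, using column_override when the location's own column is 0
// ("the line as a whole"), and omitting any property whose value is not
// positive.
inline region_object
make_region_sketch(int line, int column, int column_override)
{
  if (column == 0)
    column = column_override;

  region_object region;
  if (line > 0)
    region["startLine"] = line;
  if (column > 0)
    region["startColumn"] = column;
  return region;
}
```

With this shape, a diagnostic on "line 0" yields a region without a
startLine property rather than one holding an invalid value.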

[Bug tree-optimization/115278] [13/14 Regression] -ftree-vectorize optimizes away volatile write on x86_64 since r13-3219

2024-06-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115278

--- Comment #11 from GCC Commits  ---
The releases/gcc-14 branch has been updated by Richard Biener:

https://gcc.gnu.org/g:272e8c90af527fc1d0055ad0f17f1d97bb0bd6cb

commit r14-10335-g272e8c90af527fc1d0055ad0f17f1d97bb0bd6cb
Author: Richard Biener 
Date:   Fri May 31 10:14:25 2024 +0200

tree-optimization/115278 - fix DSE in if-conversion wrt volatiles

The following adds the missing guard for volatile stores to the
embedded DSE in the loop if-conversion pass.

PR tree-optimization/115278
* tree-if-conv.cc (ifcvt_local_dce): Do not DSE volatile stores.

* g++.dg/vect/pr115278.cc: New testcase.

(cherry picked from commit 65dbe0ab7cdaf2aa84b09a74e594f0faacf1945c)

[Bug tree-optimization/115508] [14 regression] ICE when building flac with -O2 -march=znver1 since r14-5603-g2b59e2b4dff421

2024-06-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115508

--- Comment #16 from GCC Commits  ---
The releases/gcc-14 branch has been updated by Richard Biener:

https://gcc.gnu.org/g:65e25860f49ee7a2cfd4872db06d94ed7675e12e

commit r14-10334-g65e25860f49ee7a2cfd4872db06d94ed7675e12e
Author: Richard Biener 
Date:   Mon Jun 17 14:36:56 2024 +0200

tree-optimization/115508 - fix ICE with SLP scheduling and extern vector

When there's a permute after an extern vector we can run into a case
that didn't consider the scheduled node being a permute which lacks
a representative.

PR tree-optimization/115508
* tree-vect-slp.cc (vect_schedule_slp_node): Guard check on
representative.

* gcc.target/i386/pr115508.c: New testcase.

(cherry picked from commit 65e72b95c63a5501cf1482f3814ae8c8e672bf06)

[Bug tree-optimization/110176] [11 Regression] wrong code at -Os and above on x86_64-linux-gnu since r11-2446

2024-06-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110176

--- Comment #14 from GCC Commits  ---
The releases/gcc-11 branch has been updated by Richard Biener:

https://gcc.gnu.org/g:0d0f181dedb928a6dbb9af040a09cda3f4d5da64

commit r11-11530-g0d0f181dedb928a6dbb9af040a09cda3f4d5da64
Author: Richard Biener 
Date:   Wed Jan 31 14:40:24 2024 +0100

middle-end/110176 - wrong zext (bool) <= (int) 4294967295u folding

The following fixes a wrong pattern that didn't match the behavior
of the original fold_widened_comparison in that get_unwidened
returned a constant always in the wider type.  But here we're
using (int) 4294967295u without the conversion applied.  Fixed
by doing as earlier in the pattern - matching constants only
if the conversion was actually applied.

PR middle-end/110176
* match.pd (zext (bool) <= (int) 4294967295u): Make sure
to match INTEGER_CST only without outstanding conversion.

* gcc.dg/torture/pr110176.c: New testcase.

(cherry picked from commit 22dbfbe8767ff4c1d93e39f68ec7c2d5b1358beb)
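
To see why it matters whether the conversion has been applied to the
constant, consider the following sketch. It is not the match.pd pattern; the
function names are hypothetical, and it assumes a 32-bit int where
converting 4294967295u to a signed type wraps to -1 (the behavior on
mainstream targets, and required since C++20).

```cpp
#include <cassert>
#include <cstdint>

// Comparison with the conversion applied to the constant first:
// static_cast<std::int32_t>(4294967295u) is -1 under the assumption above.
inline bool compare_converted(bool b)
{
  const std::int32_t c = static_cast<std::int32_t>(4294967295u);
  return static_cast<std::int32_t>(b) <= c;   // 0 <= -1 or 1 <= -1
}

// Comparison that uses the unconverted constant in the unsigned type.
inline bool compare_unconverted(bool b)
{
  return static_cast<std::uint32_t>(b) <= 4294967295u;
}
```

The two functions disagree for every input, so a fold that looks at the
constant without the outstanding conversion produces wrong code.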

[Bug tree-optimization/111039] [11 Regression] Unable to coalesce ssa_names

2024-06-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111039

--- Comment #7 from GCC Commits  ---
The releases/gcc-11 branch has been updated by Richard Biener:

https://gcc.gnu.org/g:bae5dcf29c6cb1f0da437858aa7214811ece53a5

commit r11-11528-gbae5dcf29c6cb1f0da437858aa7214811ece53a5
Author: Richard Biener 
Date:   Thu Aug 17 13:10:14 2023 +0200

tree-optimization/111039 - abnormals and bit test merging

The following guards the bit test merging code in if-combine against
the appearance of SSA names used in abnormal PHIs.

PR tree-optimization/111039
* tree-ssa-ifcombine.c (ifcombine_ifandif): Check for
SSA_NAME_OCCURS_IN_ABNORMAL_PHI.

* gcc.dg/pr111039.c: New testcase.

[Bug debug/111080] [11 Regression] restrict qualifier causes extra debug info to happen

2024-06-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111080

--- Comment #8 from GCC Commits  ---
The releases/gcc-11 branch has been updated by Richard Biener:

https://gcc.gnu.org/g:80ded4e8f871c98481bab912997034b9d24b1c96

commit r11-11527-g80ded4e8f871c98481bab912997034b9d24b1c96
Author: Richard Biener 
Date:   Mon Aug 21 10:34:30 2023 +0200

debug/111080 - avoid outputting debug info for unused restrict qualified
type

The following applies some maintenance with respect to type qualifiers
and kinds added by later DWARF standards to prune_unused_types_walk.
The particular case in the bug is not handling (thus marking required)
all restrict qualified type DIEs.  I've found more DW_TAG_*_type that
are unhandled, looked up the DWARF docs and added them as well based
on common sense.

PR debug/111080
* dwarf2out.c (prune_unused_types_walk): Handle
DW_TAG_restrict_type, DW_TAG_shared_type, DW_TAG_atomic_type,
DW_TAG_immutable_type, DW_TAG_coarray_type, DW_TAG_unspecified_type
and DW_TAG_dynamic_type as to only output them when referenced.

* gcc.dg/debug/dwarf2/pr111080.c: New testcase.

[Bug tree-optimization/112495] [11 Regression] ICE: verify_gimple failed (after vectorizer) with named address space (__seg_gs )

2024-06-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112495

--- Comment #8 from GCC Commits  ---
The releases/gcc-11 branch has been updated by Richard Biener:

https://gcc.gnu.org/g:20fe647365a922c6dd7a7f283abb99b5588476e8

commit r11-11525-g20fe647365a922c6dd7a7f283abb99b5588476e8
Author: Richard Biener 
Date:   Mon Nov 13 10:20:37 2023 +0100

tree-optimization/112495 - alias versioning and address spaces

We are not correctly handling differing address spaces in dependence
analysis runtime alias check generation so refuse to do that.

PR tree-optimization/112495
* tree-data-ref.c (runtime_alias_check_p): Reject checks
between different address spaces.

* gcc.target/i386/pr112495.c: New testcase.

(cherry picked from commit 0f593c0521caab8cfac53514b1a5e7d0d0dd1932)

[Bug tree-optimization/114027] [11 Regression] miscompile at `-O3 -fno-vect-cost-model -msse4.2`

2024-06-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114027

--- Comment #23 from GCC Commits  ---
The releases/gcc-11 branch has been updated by Richard Biener:

https://gcc.gnu.org/g:b6a029286d5034d63063ae78f406ba677c37d015

commit r11-11520-gb6a029286d5034d63063ae78f406ba677c37d015
Author: Richard Biener 
Date:   Thu Feb 22 10:50:12 2024 +0100

tree-optimization/114027 - conditional reduction chain

When we classify a conditional reduction chain as CONST_COND_REDUCTION
we fail to verify all involved conditionals have the same constant.
That's a quite unlikely situation so the following simply disables
such classification when there's more than one reduction statement.

PR tree-optimization/114027
* tree-vect-loop.c (vectorizable_reduction): Use optimized
condition reduction classification only for single-element
chains.

* gcc.dg/vect/pr114027.c: New testcase.

(cherry picked from commit 549f251f055e3a0b0084189a3012c4f15d635e75)

--- Comment #24 from GCC Commits  ---
The releases/gcc-11 branch has been updated by Richard Biener:

https://gcc.gnu.org/g:70ebb2ecbbcdfc40e0beff95dd11c9d678694888

commit r11-11521-g70ebb2ecbbcdfc40e0beff95dd11c9d678694888
Author: Richard Biener 
Date:   Tue Mar 26 09:46:06 2024 +0100

tree-optimization/114027 - fix testcase

The following fixes out-of-bounds read in the testcase.

PR tree-optimization/114027
* gcc.dg/vect/pr114027.c: Fix iteration count.

(cherry picked from commit 4470611e20f3217ee81647b01fda65b6a62229aa)

[Bug tree-optimization/111070] [14 Regression] ./gcc.target/tic6x/abi-align-1.c on x86_64 with -O1

2024-06-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111070

--- Comment #10 from GCC Commits  ---
The releases/gcc-11 branch has been updated by Richard Biener:

https://gcc.gnu.org/g:8d7ff01933c18532c82c864bd9182db619fcab43

commit r11-11529-g8d7ff01933c18532c82c864bd9182db619fcab43
Author: Richard Biener 
Date:   Mon Aug 21 09:01:00 2023 +0200

tree-optimization/111070 - fix ICE with recent ifcombine fix

We now got test coverage for non-SSA name bits so the following amends
the SSA_NAME_OCCURS_IN_ABNORMAL_PHI checks.

PR tree-optimization/111070
* tree-ssa-ifcombine.c (ifcombine_ifandif): Check we have
an SSA name before checking SSA_NAME_OCCURS_IN_ABNORMAL_PHI.

* gcc.dg/pr111070.c: New testcase.

[Bug tree-optimization/111445] [12 Regression] Wrong code at -Os on x86_64-linux-gnu since r12-1077-g57bf3751511

2024-06-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111445

--- Comment #12 from GCC Commits  ---
The releases/gcc-11 branch has been updated by Richard Biener:

https://gcc.gnu.org/g:ce653fa683643d4e5a7146181954358f1060dfed

commit r11-11526-gce653fa683643d4e5a7146181954358f1060dfed
Author: Richard Biener 
Date:   Fri Oct 20 15:08:49 2023 +0200

tree-optimization/111445 - simple_iv simplification fault

The following fixes a missed check in the simple_iv attempt
to simplify (signed T)((unsigned T) base + step) where it
allows a truncating inner conversion leading to wrong code.

PR tree-optimization/111445
* tree-scalar-evolution.c (simple_iv_with_niters):
Add missing check for a sign-conversion.

* gcc.dg/torture/pr111445.c: New testcase.

(cherry picked from commit 9692309ed6b625f0fb358c0e230404b5603f69a6)

[Bug tree-optimization/112505] [11 Regression] internal compiler error: in build_vector_from_val, at tree.cc:2104 since r10-4076

2024-06-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112505

--- Comment #9 from GCC Commits  ---
The releases/gcc-11 branch has been updated by Richard Biener:

https://gcc.gnu.org/g:ffaa61eb15dce3e48b4dcbca7161fc79ac9734b8

commit r11-11524-gffaa61eb15dce3e48b4dcbca7161fc79ac9734b8
Author: Richard Biener 
Date:   Thu Jan 11 14:00:33 2024 +0100

tree-optimization/112505 - bit-precision induction vectorization

Vectorization of bit-precision inductions isn't implemented but we
don't check this, instead we ICE during transform.

PR tree-optimization/112505
* tree-vect-loop.c (vectorizable_induction): Reject
bit-precision induction.

* gcc.dg/vect/pr112505.c: New testcase.

(cherry picked from commit ec345df53556ec581590347f71c3d9ff3cdbca76)

[Bug tree-optimization/112793] [11 regression] ICE when building stellarium (internal compiler error: in vect_schedule_slp_node, at tree-vect-slp.cc:9062)

2024-06-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112793

--- Comment #17 from GCC Commits  ---
The releases/gcc-11 branch has been updated by Richard Biener:

https://gcc.gnu.org/g:7e4dedbf9ff64964e356521d03a06838cbd4ca2e

commit r11-11522-g7e4dedbf9ff64964e356521d03a06838cbd4ca2e
Author: Richard Biener 
Date:   Wed Dec 13 14:23:31 2023 +0100

tree-optimization/112793 - SLP of constant/external code-generated twice

The following makes the attempt at code-generating a constant/external
SLP node twice well-formed as that can happen when partitioning BB
vectorization attempts where we keep constants/externals unpartitioned.

PR tree-optimization/112793
* tree-vect-slp.c (vect_schedule_slp_node): Already
code-generated constant/external nodes are OK.

* g++.dg/vect/pr112793.cc: New testcase.

(cherry picked from commit d782ec8362eadc3169286eb1e39c631effd02323)

[Bug debug/112718] [11 Regression] ICE: in add_dwarf_attr, at dwarf2out.cc:4501 with -g -fdebug-types-section -flto -ffat-lto-objects

2024-06-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112718

--- Comment #7 from GCC Commits  ---
The releases/gcc-11 branch has been updated by Richard Biener:

https://gcc.gnu.org/g:33e6997ef0ea30f57089038a15e2a33fcbd81648

commit r11-11523-g33e6997ef0ea30f57089038a15e2a33fcbd81648
Author: Richard Biener 
Date:   Mon Jan 22 15:42:59 2024 +0100

debug/112718 - reset all type units with -ffat-lto-objects

When mixing -flto, -ffat-lto-objects and -fdebug-types-section we
fail to reset all type units after early output resulting in an
ICE when attempting to add then duplicate sibling attributes.

PR debug/112718
* dwarf2out.c (dwarf2out_finish): Reset all type units
for the fat part of an LTO compile.

* gcc.dg/debug/pr112718.c: New testcase.

(cherry picked from commit 7218f5050cb7163edae331f54ca163248ab48bfa)

[Bug target/114734] [11 regression] RISC-V rv64gcv_zvl256b miscompile with -flto -O3 -mrvv-vector-bits=zvl since r8-6047-g65dd1346027bb5

2024-06-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114734

--- Comment #22 from GCC Commits  ---
The releases/gcc-11 branch has been updated by Richard Biener:

https://gcc.gnu.org/g:a43c9bea7bd80af33ae116e7197f814ad911857e

commit r11-11519-ga43c9bea7bd80af33ae116e7197f814ad911857e
Author: Richard Biener 
Date:   Fri Apr 26 15:47:13 2024 +0200

middle-end/114734 - wrong code with expand_call_mem_ref

When expand_call_mem_ref looks at the definition of the address
argument to eventually expand a &TARGET_MEM_REF argument together
with a masked load it fails to honor constraints imposed by SSA
coalescing decisions.  The following fixes this.

PR middle-end/114734
* internal-fn.c (expand_call_mem_ref): Use
get_gimple_for_ssa_name to get at the def stmt of the address
argument to honor SSA coalescing constraints.

(cherry picked from commit 20ebcaf826c91ddaf2aac35417ec1e5e6d31ad50)

[Bug libstdc++/115522] [13/14/15 Regression] std::to_array no longer works for struct which is trivial but not default constructible

2024-06-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115522

--- Comment #8 from GCC Commits  ---
The master branch has been updated by Jonathan Wakely :

https://gcc.gnu.org/g:510ce5eed69ee1bea9c2c696fe3b2301e16d1486

commit r15-1533-g510ce5eed69ee1bea9c2c696fe3b2301e16d1486
Author: Jonathan Wakely 
Date:   Tue Jun 18 13:27:02 2024 +0100

libstdc++: Fix std::to_array for trivial-ish types [PR115522]

Due to PR c++/85723 the std::is_trivial trait is true for types with a
deleted default constructor, so the use of std::is_trivial in
std::to_array is not sufficient to ensure the type can be trivially
default constructed then filled using memcpy.

I also forgot that a type with a deleted assignment operator can still
be trivial, so we also need to check that it's assignable because the
is_constant_evaluated() path can't use memcpy.

Replace the uses of std::is_trivial with std::is_trivially_copyable
(needed for memcpy), std::is_trivially_default_constructible (needed so
that the default construction is valid and does no work) and
std::is_copy_assignable (needed for the constant evaluation case).
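
The three replacement traits can be sketched as a single compile-time check.
This is only an illustration under assumptions, not the libstdc++ code: the
names can_default_init_then_fill, NoDefault and NoAssign are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical helper: the three conditions the commit message lists for
// default-constructing the elements and then filling them in.
template <typename T>
constexpr bool can_default_init_then_fill =
    std::is_trivially_copyable<T>::value                   // needed for memcpy
    && std::is_trivially_default_constructible<T>::value   // default init is valid, does no work
    && std::is_copy_assignable<T>::value;                  // needed for constant evaluation

// The default constructor is deleted, so the type cannot be default
// constructed at all, although copying it is still trivial.
struct NoDefault {
  NoDefault() = delete;
  explicit NoDefault(int v) : value(v) {}
  int value;
};

// Copy assignment is deleted, so std::is_copy_assignable is false.
struct NoAssign {
  int value;
  NoAssign& operator=(const NoAssign&) = delete;
};
```

Both NoDefault and NoAssign fail the combined check, so code guarded by it
would have to fall back to element-wise construction for them.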

libstdc++-v3/ChangeLog:

PR libstdc++/115522
* include/std/array (to_array): Work around the fact that
std::is_trivial is not sufficient to check that a type is
trivially default constructible and assignable.
* testsuite/23_containers/array/creation/115522.cc: New test.

[Bug c++/85723] [C++17][DR 1496] __is_trivial intrinsic fails with no trivial non-deleted default c'tor

2024-06-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85723

--- Comment #6 from GCC Commits  ---
The master branch has been updated by Jonathan Wakely :

https://gcc.gnu.org/g:510ce5eed69ee1bea9c2c696fe3b2301e16d1486

commit r15-1533-g510ce5eed69ee1bea9c2c696fe3b2301e16d1486
Author: Jonathan Wakely 
Date:   Tue Jun 18 13:27:02 2024 +0100

libstdc++: Fix std::to_array for trivial-ish types [PR115522]

Due to PR c++/85723 the std::is_trivial trait is true for types with a
deleted default constructor, so the use of std::is_trivial in
std::to_array is not sufficient to ensure the type can be trivially
default constructed then filled using memcpy.

I also forgot that a type with a deleted assignment operator can still
be trivial, so we also need to check that it's assignable because the
is_constant_evaluated() path can't use memcpy.

Replace the uses of std::is_trivial with std::is_trivially_copyable
(needed for memcpy), std::is_trivially_default_constructible (needed so
that the default construction is valid and does no work) and
std::is_copy_assignable (needed for the constant evaluation case).

libstdc++-v3/ChangeLog:

PR libstdc++/115522
* include/std/array (to_array): Work around the fact that
std::is_trivial is not sufficient to check that a type is
trivially default constructible and assignable.
* testsuite/23_containers/array/creation/115522.cc: New test.

[Bug middle-end/68855] PAREN_EXPR not "ignored" where possible

2024-06-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68855

--- Comment #11 from GCC Commits  ---
The trunk branch has been updated by Andrew Pinski :

https://gcc.gnu.org/g:59221dc587f369695d9b0c2f73aedf8458931f0f

commit r15-1508-g59221dc587f369695d9b0c2f73aedf8458931f0f
Author: Andrew Pinski 
Date:   Thu Jun 20 15:52:05 2024 -0700

complex-lowering: Better handling of PAREN_EXPR [PR68855]

When the PAREN_EXPR tree code was added in r0-85884-gdedd42d511b6e4,
only simplified handling was added to complex lowering, which means
we would get:
```
  _9 = COMPLEX_EXPR <_15, _14>;
  _11 = ((_9));
  _19 = REALPART_EXPR <_11>;
  _20 = IMAGPART_EXPR <_11>;
```

in many cases, instead of simply:
```
  _19 = ((_15));
  _20 = ((_14));
```

So this adds full support for PAREN_EXPR to complex lowering.
It is handled very similarly to NEGATE_EXPR, except that PAREN_EXPR is created
instead of NEGATE_EXPR for the real/imag parts. This allows for
more optimizations including vectorization, especially with
-ffast-math.
gfortran.dg/vect/pr68855.f90 is an example where this could show up.
It also shows up in SPEC CPU 2006's 465.tonto; though I have not done
any benchmarking there.

Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

PR tree-optimization/68855
* tree-complex.cc (init_dont_simulate_again): Handle PAREN_EXPR
like NEGATE_EXPR.
(complex_propagate::visit_stmt): Likewise.
(expand_complex_move): Don't handle PAREN_EXPR.
(expand_complex_paren): New function.
(expand_complex_operations_1): Handle PAREN_EXPR like
NEGATE_EXPR. And call expand_complex_paren for PAREN_EXPR.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/pr68855.c: New test.
* gfortran.dg/vect/pr68855.f90: New test.

Signed-off-by: Andrew Pinski

[Bug target/115355] [12/13/14/15 Regression] vectorization exposes wrong code on P9 LE starting from r12-4496

2024-06-20 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115355

--- Comment #13 from GCC Commits  ---
The master branch has been updated by Kewen Lin :

https://gcc.gnu.org/g:52c112800d9f44457c4832309a48c00945811313

commit r15-1504-g52c112800d9f44457c4832309a48c00945811313
Author: Kewen Lin 
Date:   Thu Jun 20 20:23:56 2024 -0500

rs6000: Fix wrong RTL patterns for vector merge high/low word on LE

Commit r12-4496 changed some define_expands and define_insns
for vector merge high/low word, namely altivec_vmrg[hl]w and
vsx_xxmrg[hl]w_<mode>.  These patterns mainly serve the
built-in functions vec_merge{h,l}, __builtin_vsx_xxmrghw and
__builtin_vsx_xxmrghw_4si, plus some internal gen function
needs.  These functions should take endianness into account.
Taking vec_mergeh as an example, PVIPR defines it as "Merges
the first halves (in element order) of two vectors"; note that
this is in element order.  So it is mapped to vmrghw on
BE and to vmrglw on LE.  Although the mapped
insns differ, the RTL pattern should still be the same, as
discussed in PR106069.  That held before
commit r12-4496, when define_expand altivec_vmrghw expanded
into:

  (vec_select:VSX_W
 (vec_concat:
(match_operand:VSX_W 1 "register_operand" "wa,v")
(match_operand:VSX_W 2 "register_operand" "wa,v"))
(parallel [(const_int 0) (const_int 4)
   (const_int 1) (const_int 5)])))]

on both BE and LE then.  But commit r12-4496 changed it to
expand into:

  (vec_select:VSX_W
 (vec_concat:
(match_operand:VSX_W 1 "register_operand" "wa,v")
(match_operand:VSX_W 2 "register_operand" "wa,v"))
(parallel [(const_int 0) (const_int 4)
   (const_int 1) (const_int 5)])))]

on BE, and

  (vec_select:VSX_W
 (vec_concat:
(match_operand:VSX_W 1 "register_operand" "wa,v")
(match_operand:VSX_W 2 "register_operand" "wa,v"))
(parallel [(const_int 2) (const_int 6)
   (const_int 3) (const_int 7)])))]

on LE.  Although the mapped insns are still vmrghw on BE and
vmrglw on LE, the associated RTL pattern is completely
wrong and inconsistent with the mapped insn.  If optimization
passes leave this pattern alone, things still work even though
the pattern doesn't represent its mapped insn, which is why simple
testing on the bif doesn't expose this issue.  But once some
optimization pass such as combine makes changes based
on this wrong pattern, the result is unexpected, because
the pattern doesn't match the semantics that the expanded
insn is intended to represent.

So this patch fixes the wrong RTL pattern and ensures the
associated RTL patterns are the same as before, with the
same semantics as their mapped insns.  With the
proposed patch, expanders like altivec_vmrghw expand
into altivec_vmrghb_direct_be or altivec_vmrglb_direct_le
depending on endianness.  "direct" shows which
insn will be generated, while _be and _le distinguish the
different RTL patterns per endianness.
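
To make the two selector lists concrete, here is a small plain C++ model of
(vec_select (vec_concat a b) (parallel [...])) on four-element vectors.  It is
a sketch only, not GCC code; V4, Sel, select_from_concat and the two merge
helpers are hypothetical names introduced for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using V4 = std::array<int, 4>;
using Sel = std::array<std::size_t, 4>;

// Model of vec_select over vec_concat: the concatenation numbers the
// elements of a as 0..3 and the elements of b as 4..7.
inline V4 select_from_concat(const V4& a, const V4& b, const Sel& sel) {
  V4 out{};
  for (std::size_t i = 0; i < out.size(); ++i)
    out[i] = sel[i] < 4 ? a[sel[i]] : b[sel[i] - 4];
  return out;
}

// Selector (0 4 1 5): interleaves the first halves of a and b.
inline V4 merge_first_halves(const V4& a, const V4& b) {
  return select_from_concat(a, b, Sel{{0, 4, 1, 5}});
}

// Selector (2 6 3 7): interleaves the second halves of a and b.
inline V4 merge_second_halves(const V4& a, const V4& b) {
  return select_from_concat(a, b, Sel{{2, 6, 3, 7}});
}
```

The two selectors describe different results for the same inputs, which is
why an RTL pattern that does not match the emitted insn becomes visible as
soon as a pass reasons about the pattern itself.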

Co-authored-by: Xionghu Luo 

PR target/106069
PR target/115355

gcc/ChangeLog:

* config/rs6000/altivec.md (altivec_vmrghw_direct_<mode>): Rename
to ...
(altivec_vmrghw_direct_<mode>_be): ... this.  Add the condition
BYTES_BIG_ENDIAN.
(altivec_vmrghw_direct_<mode>_le): New define_insn.
(altivec_vmrglw_direct_<mode>): Rename to ...
(altivec_vmrglw_direct_<mode>_be): ... this.  Add the condition
BYTES_BIG_ENDIAN.
(altivec_vmrglw_direct_<mode>_le): New define_insn.
(altivec_vmrghw): Adjust by calling gen_altivec_vmrghw_direct_v4si_be
for BE and gen_altivec_vmrglw_direct_v4si_le for LE.
(altivec_vmrglw): Adjust by calling gen_altivec_vmrglw_direct_v4si_be
for BE and gen_altivec_vmrghw_direct_v4si_le for LE.
(vec_widen_umult_hi_v8hi): Adjust the call to
gen_altivec_vmrghw_direct_v4si by gen_altivec_vmrghw for BE
and by gen_altivec_vmrglw for LE.
(vec_widen_smult_hi_v8hi): Likewise.
(vec_widen_umult_lo_v8hi): Adjust the call to
gen_altivec_vmrglw_direct_v4si by gen_altivec_vmrglw for BE
and by gen_altivec_vmrghw for LE.
(vec_widen_smult_lo_v8hi): Likewise.
* config/rs6000/rs6000.cc (altivec_expand_vec_perm_const): Replace
CODE_FOR_altivec_vmrghw_direct_v4si by
CODE_FOR_altivec_vmrghw_direct_v4si_be for BE and
CODE_FOR_altivec_vmrghw_direct_v4si_le for LE.  And replace
CODE_FOR_altivec_vmrglw_direct_v4si by
CODE_FOR_altivec_vmrglw_direct_v4si_be for BE and
CODE_FOR_altivec_vmrglw_direct_v4si_le for LE.

[Bug target/106069] [12/13/14/15 Regression] wrong code with -O -fno-tree-forwprop -maltivec on ppc64le

2024-06-20 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069

--- Comment #40 from GCC Commits  ---
The master branch has been updated by Kewen Lin :

https://gcc.gnu.org/g:52c112800d9f44457c4832309a48c00945811313

commit r15-1504-g52c112800d9f44457c4832309a48c00945811313
Author: Kewen Lin 
Date:   Thu Jun 20 20:23:56 2024 -0500

rs6000: Fix wrong RTL patterns for vector merge high/low word on LE

Commit r12-4496 changed some define_expands and define_insns
for vector merge high/low word, namely altivec_vmrg[hl]w and
vsx_xxmrg[hl]w_<mode>.  These patterns mainly serve the
built-in functions vec_merge{h,l}, __builtin_vsx_xxmrghw and
__builtin_vsx_xxmrghw_4si, plus some internal gen function
needs.  These functions should take endianness into account.
Taking vec_mergeh as an example, PVIPR defines it as "Merges
the first halves (in element order) of two vectors"; note that
this is in element order.  So it is mapped to vmrghw on
BE and to vmrglw on LE.  Although the mapped
insns differ, the RTL pattern should still be the same, as
discussed in PR106069.  That held before
commit r12-4496, when define_expand altivec_vmrghw expanded
into:

  (vec_select:VSX_W
 (vec_concat:
(match_operand:VSX_W 1 "register_operand" "wa,v")
(match_operand:VSX_W 2 "register_operand" "wa,v"))
(parallel [(const_int 0) (const_int 4)
   (const_int 1) (const_int 5)])))]

on both BE and LE then.  But commit r12-4496 changed it to
expand into:

  (vec_select:VSX_W
 (vec_concat:
(match_operand:VSX_W 1 "register_operand" "wa,v")
(match_operand:VSX_W 2 "register_operand" "wa,v"))
(parallel [(const_int 0) (const_int 4)
   (const_int 1) (const_int 5)])))]

on BE, and

  (vec_select:VSX_W
 (vec_concat:
(match_operand:VSX_W 1 "register_operand" "wa,v")
(match_operand:VSX_W 2 "register_operand" "wa,v"))
(parallel [(const_int 2) (const_int 6)
   (const_int 3) (const_int 7)])))]

on LE.  Although the mapped insns are still vmrghw on BE and
vmrglw on LE, the associated RTL pattern is completely
wrong and inconsistent with the mapped insn.  If optimization
passes leave this pattern alone, things still work even though
the pattern doesn't represent its mapped insn, which is why simple
testing on the bif doesn't expose this issue.  But once some
optimization pass such as combine makes changes based
on this wrong pattern, the result is unexpected, because
the pattern doesn't match the semantics that the expanded
insn is intended to represent.

So this patch fixes the wrong RTL pattern and ensures the
associated RTL patterns are the same as before, with the
same semantics as their mapped insns.  With the
proposed patch, expanders like altivec_vmrghw expand
into altivec_vmrghb_direct_be or altivec_vmrglb_direct_le
depending on endianness.  "direct" shows which
insn will be generated, while _be and _le distinguish the
different RTL patterns per endianness.

Co-authored-by: Xionghu Luo 

PR target/106069
PR target/115355

gcc/ChangeLog:

* config/rs6000/altivec.md (altivec_vmrghw_direct_<mode>): Rename
to ...
(altivec_vmrghw_direct_<mode>_be): ... this.  Add the condition
BYTES_BIG_ENDIAN.
(altivec_vmrghw_direct_<mode>_le): New define_insn.
(altivec_vmrglw_direct_<mode>): Rename to ...
(altivec_vmrglw_direct_<mode>_be): ... this.  Add the condition
BYTES_BIG_ENDIAN.
(altivec_vmrglw_direct_<mode>_le): New define_insn.
(altivec_vmrghw): Adjust by calling gen_altivec_vmrghw_direct_v4si_be
for BE and gen_altivec_vmrglw_direct_v4si_le for LE.
(altivec_vmrglw): Adjust by calling gen_altivec_vmrglw_direct_v4si_be
for BE and gen_altivec_vmrghw_direct_v4si_le for LE.
(vec_widen_umult_hi_v8hi): Adjust the call to
gen_altivec_vmrghw_direct_v4si by gen_altivec_vmrghw for BE
and by gen_altivec_vmrglw for LE.
(vec_widen_smult_hi_v8hi): Likewise.
(vec_widen_umult_lo_v8hi): Adjust the call to
gen_altivec_vmrglw_direct_v4si by gen_altivec_vmrglw for BE
and by gen_altivec_vmrghw for LE.
(vec_widen_smult_lo_v8hi): Likewise.
* config/rs6000/rs6000.cc (altivec_expand_vec_perm_const): Replace
CODE_FOR_altivec_vmrghw_direct_v4si by
CODE_FOR_altivec_vmrghw_direct_v4si_be for BE and
CODE_FOR_altivec_vmrghw_direct_v4si_le for LE.  And replace
CODE_FOR_altivec_vmrglw_direct_v4si by
CODE_FOR_altivec_vmrglw_direct_v4si_be for BE and
CODE_FOR_altivec_vmrglw_direct_v4si_le for LE.

[Bug libstdc++/115454] std::experimental::find_last_set is buggy on x86-64-v4

2024-06-20 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115454

--- Comment #5 from GCC Commits  ---
The releases/gcc-11 branch has been updated by Matthias Kretz:

https://gcc.gnu.org/g:a0b92a530ad61389c0cdeb8d8ece4677e019c28e

commit r11-11517-ga0b92a530ad61389c0cdeb8d8ece4677e019c28e
Author: Matthias Kretz 
Date:   Fri Jun 14 15:11:25 2024 +0200

libstdc++: Fix find_last_set(simd_mask) to ignore padding bits

With the change to the AVX512 find_last_set implementation, the change
to AVX512 operator!= is unnecessary. However, the latter was not
producing optimal code and unnecessarily set the padding bits. In
theory, the compiler could determine that with the new !=
implementation, the bit operation for clearing the padding bits is a
no-op and can be elided.

Signed-off-by: Matthias Kretz 

libstdc++-v3/ChangeLog:

PR libstdc++/115454
* include/experimental/bits/simd_x86.h (_S_not_equal_to): Use
neq comparison instead of bitwise negation after eq.
(_S_find_last_set): Clear unused high bits before computing
bit_width.
* testsuite/experimental/simd/pr115454_find_last_set.cc: New
test.
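
The idea of clearing the unused high bits before computing the bit width can
be sketched as follows.  This is a simplified model, assuming an N-element
mask kept in the low N bits of a 64-bit integer; it is not the simd_x86.h
implementation, and bit_width_u64 and find_last_set_model are hypothetical
names.

```cpp
#include <cassert>
#include <cstdint>

// Number of bits needed to represent v (0 for v == 0).
inline int bit_width_u64(std::uint64_t v) {
  int width = 0;
  while (v != 0) {
    ++width;
    v >>= 1;
  }
  return width;
}

// Index of the highest set element in an n-element mask, or -1 if none is
// set.  Bits at positions >= n are padding and are cleared first.
inline int find_last_set_model(std::uint64_t bits, int n) {
  const std::uint64_t valid =
      n >= 64 ? ~std::uint64_t{0} : ((std::uint64_t{1} << n) - 1);
  return bit_width_u64(bits & valid) - 1;
}
```

For a 4-element mask stored as 0xF5, the padding bits would make a naive
bit-width computation report element 7, while masking first gives element 2.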

(cherry picked from commit 4787960dcaf0de3f46464960f5246de9b3c69a06)

[Bug libstdc++/115454] std::experimental::find_last_set is buggy on x86-64-v4

2024-06-20 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115454

--- Comment #4 from GCC Commits  ---
The releases/gcc-14 branch has been updated by Matthias Kretz:

https://gcc.gnu.org/g:e77f314ccd422ffa05c6c06837e2cb94c25f2bba

commit r14-10331-ge77f314ccd422ffa05c6c06837e2cb94c25f2bba
Author: Matthias Kretz 
Date:   Fri Jun 14 15:11:25 2024 +0200

libstdc++: Fix find_last_set(simd_mask) to ignore padding bits

With the change to the AVX512 find_last_set implementation, the change
to AVX512 operator!= is unnecessary. However, the latter was not
producing optimal code and unnecessarily set the padding bits. In
theory, the compiler could determine that with the new !=
implementation, the bit operation for clearing the padding bits is a
no-op and can be elided.

Signed-off-by: Matthias Kretz 

libstdc++-v3/ChangeLog:

PR libstdc++/115454
* include/experimental/bits/simd_x86.h (_S_not_equal_to): Use
neq comparison instead of bitwise negation after eq.
(_S_find_last_set): Clear unused high bits before computing
bit_width.
* testsuite/experimental/simd/pr115454_find_last_set.cc: New
test.

(cherry picked from commit 1340ddea0158de3f49aeb75b4013e5fc313ff6f4)

[Bug driver/115440] unrecognized command-line option '--c++17'; did you mean '--stdc++17'?

2024-06-20 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115440

--- Comment #8 from GCC Commits  ---
The releases/gcc-11 branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:a0dac8fdf477f0ee7fa4f54bbfc4cafec944b042

commit r11-11516-ga0dac8fdf477f0ee7fa4f54bbfc4cafec944b042
Author: Jakub Jelinek 
Date:   Mon Jun 17 22:02:46 2024 +0200

diagnostics: Fix add_misspelling_candidates [PR115440]

The option_map array for most entries contains just non-NULL opt0
{ "-Wno-", NULL, "-W", false, true },
{ "-fno-", NULL, "-f", false, true },
{ "-gno-", NULL, "-g", false, true },
{ "-mno-", NULL, "-m", false, true },
{ "--debug=", NULL, "-g", false, false },
{ "--machine-", NULL, "-m", true, false },
{ "--machine-no-", NULL, "-m", false, true },
{ "--machine=", NULL, "-m", false, false },
{ "--machine=no-", NULL, "-m", false, true },
{ "--machine", "", "-m", false, false },
{ "--machine", "no-", "-m", false, true },
{ "--optimize=", NULL, "-O", false, false },
{ "--std=", NULL, "-std=", false, false },
{ "--std", "", "-std=", false, false },
{ "--warn-", NULL, "-W", true, false },
{ "--warn-no-", NULL, "-W", false, true },
{ "--", NULL, "-f", true, false },
{ "--no-", NULL, "-f", false, true }
and so add_misspelling_candidates works correctly for it, but 3 out of
these,
{ "--machine", "", "-m", false, false },
{ "--machine", "no-", "-m", false, true },
and
{ "--std", "", "-std=", false, false },
use non-NULL opt1.  That says that
--machine foo
should map to
-mfoo
and
--machine no-foo
should map to
-mno-foo
and
--std c++17
should map to
-std=c++17
add_misspelling_candidates was not handling this, so it happily
registered say
--stdc++17
or
--machineavx512
(twice) as spelling alternatives, when those options aren't recognized.
Instead we support
--std c++17
or
--machine avx512
--machine no-avx512

The following patch fixes that.  On this particular testcase, we no longer
suggest anything, even though the candidates now include, say,
--std c++17
or
-std=c++17
etc.
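
How a non-NULL opt1 changes the alternative spelling can be sketched as
follows.  This is a reduced model, not the opts-common.c code: MapEntry keeps
only three of the option_map fields, and alternative_spelling is a
hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Reduced, hypothetical form of an option_map entry.
struct MapEntry {
  const char* opt0;        // alternative prefix, e.g. "--std"
  const char* opt1;        // prefix of the separate argument, or nullptr
  const char* new_prefix;  // canonical prefix, e.g. "-std="
};

// Builds the alternative spelling of a canonical option.  When opt1 is
// non-null the rest of the option is a separate argument, so a space goes
// between opt0 and opt1.  Returns "" if the canonical prefix doesn't match.
inline std::string alternative_spelling(const MapEntry& e,
                                        const std::string& canonical) {
  const std::size_t len = std::strlen(e.new_prefix);
  if (canonical.compare(0, len, e.new_prefix) != 0)
    return std::string();
  std::string alt = e.opt0;
  if (e.opt1 != nullptr) {
    alt += ' ';
    alt += e.opt1;
  }
  alt += canonical.substr(len);
  return alt;
}
```

With the entry { "--std", "", "-std=" } this yields "--std c++17" for
"-std=c++17" rather than the unsupported "--stdc++17".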

2024-06-17  Jakub Jelinek  

PR driver/115440
* opts-common.c (add_misspelling_candidates): If opt1 is non-NULL,
add a space and opt1 to the alternative suggestion text.

* g++.dg/cpp1z/pr115440.C: New test.

(cherry picked from commit 96db57948b50f45235ae4af3b46db66cae7ea859)

[Bug rtl-optimization/115092] [14/15 Regression] wrong code at -O1 with "-fgcse -ftree-pre -fno-tree-dominator-opts -fno-tree-fre -fno-guess-branch-probability" on x86_64-linux-gnu since r14-4810

2024-06-20 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115092

--- Comment #16 from GCC Commits  ---
The releases/gcc-11 branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:82bb4ee090342f2b787661420f701f2e0bf6624a

commit r11-11512-g82bb4ee090342f2b787661420f701f2e0bf6624a
Author: Jakub Jelinek 
Date:   Wed May 15 18:37:17 2024 +0200

combine: Fix up simplify_compare_const [PR115092]

The following testcases are miscompiled (with tons of GIMPLE
optimization disabled) because combine sees GE comparison of
1-bit sign_extract (i.e. something with [-1, 0] value range)
with (const_int -1) (which is always true) and optimizes it into
NE comparison of 1-bit zero_extract ([0, 1] value range) against
(const_int 0).
The reason is that simplify_compare_const first (correctly)
simplifies the comparison to
GE (ashift:SI something (const_int 31)) (const_int -2147483648)
and then an optimization for when the second operand is power of 2
triggers.  That optimization is fine for power of 2s which aren't
the signed minimum of the mode, or if it is NE, EQ, GEU or LTU
against the signed minimum of the mode, but for GE or LT optimizing
it into NE (or EQ) against const0_rtx is wrong, those cases
are always true or always false (but the function doesn't have
a standardized way to tell callers the comparison is now unconditional).

The following patch just disables the optimization in that case.
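
The difference between the signed and unsigned comparisons against the signed
minimum can be checked with a small model.  This is only an illustration of
the arithmetic, not combine.c code; all function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Value of a 1-bit input shifted into the sign bit of a 32-bit integer,
// written without relying on signed overflow.
inline std::int32_t shift_bit_to_sign(unsigned bit) {
  return bit ? std::numeric_limits<std::int32_t>::min() : 0;
}

// GE against the signed minimum: true for every possible value.
inline bool signed_ge_min(std::int32_t v) {
  return v >= std::numeric_limits<std::int32_t>::min();
}

// GEU against the same bit pattern: true only when the sign bit is set.
inline bool unsigned_ge_min(std::int32_t v) {
  return static_cast<std::uint32_t>(v) >= 0x80000000u;
}

// The comparison the power-of-2 optimization rewrites to.
inline bool ne_zero(std::int32_t v) { return v != 0; }
```

For the unsigned comparison the rewrite to "not equal to zero" agrees on both
inputs, while for the signed comparison it disagrees when the input bit is 0.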

2024-05-15  Jakub Jelinek  

PR rtl-optimization/114902
PR rtl-optimization/115092
* combine.c (simplify_compare_const): Don't optimize
GE op0 SIGNED_MIN or LT op0 SIGNED_MIN into NE op0 const0_rtx or
EQ op0 const0_rtx.

* gcc.dg/pr114902.c: New test.
* gcc.dg/pr115092.c: New test.

(cherry picked from commit 0b93a0ae153ef70a82ff63e67926a01fdab9956b)

[Bug c/114493] [11 Regression] internal compiler error: in fld_incomplete_type_of with may_alias

2024-06-20 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114493

--- Comment #17 from GCC Commits  ---
The releases/gcc-11 branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:5736bc05ea1546328cad9b6cefb5a389d2a2994e

commit r11-11515-g5736bc05ea1546328cad9b6cefb5a389d2a2994e
Author: Jakub Jelinek 
Date:   Thu Jun 6 22:12:11 2024 +0200

c: Fix up pointer types to may_alias structures [PR114493]

The following testcase ICEs in ipa-free-lang, because the
fld_incomplete_type_of
  gcc_assert (TYPE_CANONICAL (t2) != t2
              && TYPE_CANONICAL (t2) == TYPE_CANONICAL (TREE_TYPE (t)));
assertion doesn't hold.
This is because t is a struct S * type which was created while struct S
was still incomplete and without the may_alias attribute (and TYPE_CANONICAL
of a pointer type is a type created with can_alias_all = false argument),
while later on, on the struct definition, the may_alias attribute was used.
fld_incomplete_type_of then creates an incomplete distinct copy of the
structure (but with the original attributes) but pointers created for it
are because of the "may_alias" attribute TYPE_REF_CAN_ALIAS_ALL, including
their TYPE_CANONICAL, because while that is created with !can_alias_all
argument, we later set it because of the "may_alias" attribute on the
to_type.

This doesn't ICE with C++ since PR70512 fix because the C++ FE sets
TYPE_REF_CAN_ALIAS_ALL on all pointer types to the class type (and its
variants) when the may_alias is added.

The following patch does that in the C FE as well.

2024-06-06  Jakub Jelinek  

PR c/114493
* c-decl.c (c_fixup_may_alias): New function.
(finish_struct): Call it if "may_alias" attribute is
specified.

* gcc.dg/pr114493-1.c: New test.
* gcc.dg/pr114493-2.c: New test.

(cherry picked from commit d5a3c6d43acb8b2211d9fb59d59482d74c010f01)

[Bug fortran/114825] [11 Regression] Compiler error using gfortran and OpenMP since r5-1190

2024-06-20 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114825

--- Comment #11 from GCC Commits  ---
The releases/gcc-11 branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:0ff2aad2f7242fffa646d06b3da7d28191e1c788

commit r11-11509-g0ff2aad2f7242fffa646d06b3da7d28191e1c788
Author: Jakub Jelinek 
Date:   Thu Apr 25 20:09:35 2024 +0200

openmp: Copy DECL_LANG_SPECIFIC and DECL_LANG_FLAG_? to tree-nested decl copy [PR114825]

tree-nested.cc creates in 2 spots artificial VAR_DECLs, one of them is used
both for debug info and OpenMP/OpenACC lowering purposes, the other solely
for OpenMP/OpenACC lowering purposes.
When the decls are used in OpenMP/OpenACC lowering, the OMP langhooks
(mostly Fortran, C just a little and C++ doesn't have nested functions) then
inspect the flags on the vars and based on that decide how to lower the
corresponding clauses.

Unfortunately we weren't copying DECL_LANG_SPECIFIC and DECL_LANG_FLAG_?,
so the langhooks made decisions on the default flags on those instead.
As the original decl isn't necessarily a VAR_DECL, could be e.g. PARM_DECL,
using copy_node wouldn't work properly, so this patch just copies those
flags in addition to other flags it was copying already.  And I've removed
code duplication by introducing a helper function which does copying common
to both uses.

2024-04-25  Jakub Jelinek  

PR fortran/114825
* tree-nested.c (get_debug_decl): New function.
(get_nonlocal_debug_decl): Use it.
(get_local_debug_decl): Likewise.

* gfortran.dg/gomp/pr114825.f90: New test.

(cherry picked from commit 14d48516e588ad2b35e2007b3970bdcb1b3f145c)

[Bug tree-optimization/115337] wrong code with _BitInt() __builtin_stdc_first_leading_one/__builtin_clzg (with -1 as second arg) at -O2

2024-06-20 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115337

--- Comment #16 from GCC Commits  ---
The releases/gcc-11 branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:b724525c9779bbf93c4905dc3d54296f5e39e607

commit r11-11514-gb724525c9779bbf93c4905dc3d54296f5e39e607
Author: Jakub Jelinek 
Date:   Tue Jun 4 15:49:41 2024 +0200

fold-const: Fix up CLZ handling in tree_call_nonnegative_warnv_p [PR115337]

The function currently incorrectly assumes all the __builtin_clz* and .CLZ
calls have non-negative result.  That is the case of the former which is UB
on zero and has [0, prec-1] return value otherwise, and is the case of the
single argument .CLZ as well (again, UB on zero), but for two argument
.CLZ that is the case only if the second argument is also nonnegative (or if we
know the argument can't be zero, but let's do that just in the ranger
IMHO).

The following patch does that.
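
Why the two-argument form can produce a negative result is easy to see in a
portable model.  This sketch does not call the GCC builtin or the .CLZ
internal function; clz_or is a hypothetical helper that mimics "count leading
zeros, with a caller-chosen value for a zero input".

```cpp
#include <cassert>
#include <limits>

// Counts leading zero bits of x; returns value_at_zero when x is zero.
inline int clz_or(unsigned x, int value_at_zero) {
  if (x == 0)
    return value_at_zero;
  int count = 0;
  for (unsigned mask = 1u << (std::numeric_limits<unsigned>::digits - 1);
       (x & mask) == 0; mask >>= 1)
    ++count;
  return count;
}
```

With -1 as the second argument the result is negative for a zero input, so
the call as a whole is only known to be nonnegative when that argument is.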

2024-06-04  Jakub Jelinek  

PR tree-optimization/115337
* fold-const.c (tree_call_nonnegative_warnv_p) :
If fn is CFN_CLZ, use CLZ_DEFINED_VALUE_AT.

(cherry picked from commit b82a816000791e7a286c7836b3a473ec0e2a577b)

[Bug sanitizer/114956] [11 Regression] Segmentation fault with -fsanitize=address -fsanitize=null -O2 when attribute no_sanitize_address is enabled since r9-5742

2024-06-20 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114956

--- Comment #11 from GCC Commits  ---
The releases/gcc-11 branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:e4bd9558d431bee2f4a76875dd79c9a76d2c58ff

commit r11-11511-ge4bd9558d431bee2f4a76875dd79c9a76d2c58ff
Author: Jakub Jelinek 
Date:   Tue May 7 21:29:14 2024 +0200

tree-inline: Remove .ASAN_MARK calls when inlining functions into no_sanitize callers [PR114956]

In r9-5742 we've started allowing to inline always_inline functions into
functions which have disabled e.g. address sanitization even when the
always_inline function is implicitly from command line options sanitized.

This mostly works fine because most of the asan instrumentation is done
only late after ipa, but as the following testcase shows, the .ASAN_MARK
ifn calls the gimplifier adds can result in ICEs.

Fixed by dropping those during inlining, similarly to how we drop
.TSAN_FUNC_EXIT calls.

2024-05-07  Jakub Jelinek  

PR sanitizer/114956
* tree-inline.c: Include asan.h.
(copy_bb): Remove also .ASAN_MARK calls if id->dst_fn has
asan/hwasan sanitization disabled.

* gcc.dg/asan/pr114956.c: New test.

(cherry picked from commit d4e25cf4f7c1f51a8824cc62bbb85a81a41b829a)

[Bug tree-optimization/114876] [11 Regression] -fprintf-return-value mishandles %lc with a '\0' argument.

2024-06-20 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114876

--- Comment #13 from GCC Commits  ---
The releases/gcc-11 branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:6bb5036635090b924519adde22e7be9c80f6e76a

commit r11-11510-g6bb5036635090b924519adde22e7be9c80f6e76a
Author: Jakub Jelinek 
Date:   Tue Apr 30 11:22:32 2024 +0200

gimple-ssa-sprintf: Use [0, 1] range for %lc with (wint_t) 0 argument [PR114876]

It seems that when Martin S. implemented this, he coded a strict reading
of the standard, which said that %lc with (wint_t) 0 argument is handled
as wchar_t[2] temp = { arg, 0 }; %ls with temp arg and so shouldn't print
any values.  But most libc implementations actually handled that
case like %c with '\0' argument, adding a single NUL character; the only
known exception is musl.
Recently, C23 changed this in response to GB-141 and POSIX in
https://austingroupbugs.net/view.php?id=1647
so that it should have the same behavior as %c with '\0'.

Because there is implementation divergence, the following patch uses
a range rather than hardcoding it to all 1s (i.e. the %c behavior),
though the likely case is still 1 (forward looking plus most of
implementations).
The res.knownrange = true; assignment removed is redundant due to
the same assignment done unconditionally before the if statement,
rest is formatting fixes.
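
The implementation divergence can be observed directly.  The following is a
minimal sketch; lc_nul_output_length is a hypothetical helper, and which of
the two values it returns depends on the C library in use.

```cpp
#include <cassert>
#include <cstdio>
#include <cwchar>

// Returns the number of bytes snprintf reports for "%lc" with a (wint_t) 0
// argument: 1 if the libc treats it like %c with '\0', 0 under the strict
// older reading of the standard.
inline int lc_nul_output_length() {
  char buf[8];
  return std::snprintf(buf, sizeof buf, "%lc", static_cast<std::wint_t>(0));
}
```

Because both results occur in practice, a compiler that predicts the return
value has to assume a range covering both rather than a single constant.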

I don't think the min >= 0 && min < 128 case is right either, I'd think
it should be min >= 0 && max < 128, otherwise it is just some possible
inputs are (maybe) ASCII and there can be others, but this code is a total
mess anyway, with the min, max, likely (somewhere in [min, max]?) and then
unlikely possibly larger than max, dunno, perhaps for at least some chars
in the ASCII range the likely case could be for the ascii case; so perhaps
just the one_2_one_ascii shouldn't set max to 1 and mayfail should be true
for max >= 128.  Anyway, didn't feel I should touch that right now.

2024-04-30  Jakub Jelinek  

PR tree-optimization/114876
* gimple-ssa-sprintf.c (format_character): For min == 0 && max == 0,
set max, likely and unlikely members to 1 rather than 0.  Remove
useless res.knownrange = true;.  Formatting fixes.

* gcc.dg/pr114876.c: New test.
* gcc.dg/tree-ssa/builtin-sprintf-warn-1.c: Adjust expected
diagnostics.

(cherry picked from commit 6c6b70f07208ca14ba783933988c04c6fc2fff42)

[Bug middle-end/108789] __builtin_(add|mul|sub)_overflow methods generate duplicate operations if both operands are const which in turn causes wrong code due to overlapping arguments

2024-06-20 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108789

--- Comment #12 from GCC Commits  ---
The releases/gcc-11 branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:2922060a7fe04397df1ca69dd1f4de1edf8d2282

commit r11-11513-g2922060a7fe04397df1ca69dd1f4de1edf8d2282
Author: Jakub Jelinek 
Date:   Tue Jun 4 12:28:01 2024 +0200

builtins: Force SAVE_EXPR for __builtin_{add,sub,mul}_overflow [PR108789]

The following testcase is miscompiled, because we use save_expr
on the .{ADD,SUB,MUL}_OVERFLOW call we are creating, but if the first
two operands are not INTEGER_CSTs (in that case we just fold it right away)
but are TREE_READONLY/!TREE_SIDE_EFFECTS, save_expr doesn't actually
create a SAVE_EXPR at all and so we lower it to
*arg2 = REALPART_EXPR (.ADD_OVERFLOW (arg0, arg1)), \
IMAGPART_EXPR (.ADD_OVERFLOW (arg0, arg1))
which evaluates the ifn twice and just hopes it will be CSEd back.
As *arg2 aliases *arg0, that is not the case.
The builtins are really never const/pure as they store into what
the third argument points to, so after handling the INTEGER_CST+INTEGER_CST
case, I think we should just always use SAVE_EXPR.  Just building SAVE_EXPR
by hand and setting TREE_SIDE_EFFECTS on it doesn't work, because
c_fully_fold optimizes it away again, so the following patch marks the
ifn calls as TREE_SIDE_EFFECTS (but doesn't do it for the
__builtin_{add,sub,mul}_overflow_p case which were designed for use
especially in constant expressions and don't really evaluate the
realpart side, so we don't really need a SAVE_EXPR in that case).
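
The aliasing situation, where the result pointer refers to the same object as
the first operand, looks like this in user code.  This is a minimal sketch and
not the testcase from the PR; add_in_place is a hypothetical wrapper around
the GCC/Clang builtin.

```cpp
#include <cassert>
#include <climits>

// Adds inc to *acc in place.  The output pointer aliases the first operand,
// so the operation must be evaluated exactly once.  Returns true if the
// mathematical result did not fit in int; *acc then holds the wrapped value.
inline bool add_in_place(int* acc, int inc) {
  return __builtin_add_overflow(*acc, inc, acc);
}
```

If the operation were evaluated twice, the second evaluation would read the
value already stored by the first, which is why a single evaluation matters.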

2024-06-04  Jakub Jelinek  

PR middle-end/108789
* builtins.c (fold_builtin_arith_overflow): For ovf_only,
don't call save_expr and don't build REALPART_EXPR, otherwise
set TREE_SIDE_EFFECTS on call before calling save_expr.

* gcc.c-torture/execute/pr108789.c: New test.

(cherry picked from commit b8e28381cb5c0cddfe5201faf799d8b27f5d7d6c)

[Bug rtl-optimization/114902] [14 Regression] wrong code at -O3 with "-fno-tree-vrp -fno-expensive-optimizations -fno-tree-dominator-opts" on x86_64-linux-gnu

2024-06-20 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114902

--- Comment #19 from GCC Commits  ---
The releases/gcc-11 branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:82bb4ee090342f2b787661420f701f2e0bf6624a

commit r11-11512-g82bb4ee090342f2b787661420f701f2e0bf6624a
Author: Jakub Jelinek 
Date:   Wed May 15 18:37:17 2024 +0200

combine: Fix up simplify_compare_const [PR115092]

The following testcases are miscompiled (with tons of GIMPLE
optimization disabled) because combine sees GE comparison of
1-bit sign_extract (i.e. something with [-1, 0] value range)
with (const_int -1) (which is always true) and optimizes it into
NE comparison of 1-bit zero_extract ([0, 1] value range) against
(const_int 0).
The reason is that simplify_compare_const first (correctly)
simplifies the comparison to
GE (ashift:SI something (const_int 31)) (const_int -2147483648)
and then an optimization for when the second operand is power of 2
triggers.  That optimization is fine for power of 2s which aren't
the signed minimum of the mode, or if it is NE, EQ, GEU or LTU
against the signed minimum of the mode, but for GE or LT optimizing
it into NE (or EQ) against const0_rtx is wrong, those cases
are always true or always false (but the function doesn't have
a standardized way to tell callers the comparison is now unconditional).

The following patch just disables the optimization in that case.

2024-05-15  Jakub Jelinek  

PR rtl-optimization/114902
PR rtl-optimization/115092
* combine.c (simplify_compare_const): Don't optimize
GE op0 SIGNED_MIN or LT op0 SIGNED_MIN into NE op0 const0_rtx or
EQ op0 const0_rtx.

* gcc.dg/pr114902.c: New test.
* gcc.dg/pr115092.c: New test.

(cherry picked from commit 0b93a0ae153ef70a82ff63e67926a01fdab9956b)
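
The value-range argument in the commit message above can be sketched in plain C. This is only a model: the function names are hypothetical, and the real transformation operates on RTL inside combine, not on C expressions. It shows that the original comparison holds for both possible bit values while the rewritten one does not.

```c
#include <assert.h>
#include <stdbool.h>

/* Model of a 1-bit sign_extract: value range [-1, 0].  */
int
sign_extract_1bit (unsigned bit)
{
  assert (bit <= 1);
  return -(int) bit;
}

/* Model of a 1-bit zero_extract: value range [0, 1].  */
int
zero_extract_1bit (unsigned bit)
{
  assert (bit <= 1);
  return (int) bit;
}

/* Original comparison: GE against -1, true for the whole range.  */
bool
original_compare (unsigned bit)
{
  return sign_extract_1bit (bit) >= -1;
}

/* Invalid rewrite: NE against 0, true only when the bit is set.  */
bool
rewritten_compare (unsigned bit)
{
  return zero_extract_1bit (bit) != 0;
}
```

The two functions disagree when the bit is clear, which is the miscompilation the patch avoids by not applying the power-of-2 optimization to GE or LT against the signed minimum.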

[Bug middle-end/114753] from_chars aborts with -m32 -ftrapv when passed -9223372036854775808

2024-06-20 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114753

--- Comment #14 from GCC Commits  ---
The releases/gcc-11 branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:2e932260ca4f2ba2549eb42d60e701d2244dab74

commit r11-11507-g2e932260ca4f2ba2549eb42d60e701d2244dab74
Author: Jakub Jelinek 
Date:   Thu Apr 18 09:45:14 2024 +0200

internal-fn: Temporarily disable flag_trapv during .{ADD,SUB,MUL}_OVERFLOW
etc. expansion [PR114753]

__builtin_{add,sub,mul}_overflow{,_p} builtins are well defined
for all inputs even for -ftrapv, and the -fsanitize=signed-integer-overflow
ifns shouldn't abort in libgcc but emit the desired ubsan diagnostics
or abort depending on -fsanitize* setting regardless of -ftrapv.
The expansion of these internal functions uses expand_expr* in various
places (e.g. MULT_EXPR at least in 2 spots), so temporarily disabling
flag_trapv in all those spots would be hard.
The following patch disables it around the bodies of 3 functions
which can do the expand_expr calls.
If it was in the C++ FE, I'd use some RAII sentinel, but I don't think
we have one in the middle-end.

2024-04-18  Jakub Jelinek  

PR middle-end/114753
* internal-fn.c (expand_mul_overflow): Save flag_trapv and
temporarily clear it for the duration of the function, then
restore previous value.
(expand_vector_ubsan_overflow): Likewise.
(expand_arith_overflow): Likewise.

* gcc.dg/pr114753.c: New test.

(cherry picked from commit 6c152c9db3b5b9d43e12846fb7a44977c0b65fc2)
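
The save, clear and restore pattern mentioned in the commit message above might look roughly like this in C, where no RAII sentinel is available. All names here (trap_on_overflow, helper_calls, expand_helper, expand_with_trapping_disabled) are hypothetical stand-ins and are not GCC's actual code.

```c
#include <assert.h>

/* Hypothetical stand-in for a global option flag such as flag_trapv.  */
int trap_on_overflow = 1;

/* Counts how often the hypothetical helper ran.  */
int helper_calls = 0;

/* Hypothetical expansion helper that must only run with trapping
   disabled.  */
void
expand_helper (void)
{
  assert (trap_on_overflow == 0);
  helper_calls++;
}

/* Save the flag, clear it for the duration of the body, then restore
   the previous value before returning.  */
void
expand_with_trapping_disabled (void)
{
  int saved = trap_on_overflow;
  trap_on_overflow = 0;
  expand_helper ();
  expand_helper ();
  trap_on_overflow = saved;
}
```

Without a destructor to rely on, every exit path of such a function has to restore the saved value by hand, which is presumably why the patch wraps whole function bodies rather than individual expand_expr calls.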

[Bug c++/114634] [11 Regression] Crash Issue Encountered in GCC Compilation of Template Code with Aligned Attribute since r9-1745

2024-06-20 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114634

--- Comment #10 from GCC Commits  ---
The releases/gcc-11 branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:139f129bf4f0d40f1e6fb619c044bc0ef699a014

commit r11-11506-g139f129bf4f0d40f1e6fb619c044bc0ef699a014
Author: Jakub Jelinek 
Date:   Mon Apr 15 10:25:22 2024 +0200

attribs: Don't crash on NULL TREE_TYPE in diag_attr_exclusions [PR114634]

The enumerator still doesn't have TREE_TYPE set but diag_attr_exclusions
assumes that all decls must have types.
I think it is better in something as unimportant as diag_attr_exclusions
to be more robust, if there is no type, it can just diagnose exclusions
on the DECL_ATTRIBUTES, like for types it only diagnoses it on
TYPE_ATTRIBUTES.

2024-04-15  Jakub Jelinek  

PR c++/114634
* attribs.c (diag_attr_exclusions): Set attrs[1] to NULL_TREE for
decls with NULL TREE_TYPE.

* g++.dg/ext/attrib68.C: New test.

(cherry picked from commit 7ec54f5fdfec298812a749699874db4d6a7246bb)

[Bug rtl-optimization/114768] Volatile reads can be optimized away

2024-06-20 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114768

--- Comment #14 from GCC Commits  ---
The releases/gcc-11 branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:1fc4a915c797d3a98a327dfc546948c5879336e0

commit r11-11508-g1fc4a915c797d3a98a327dfc546948c5879336e0
Author: Jakub Jelinek 
Date:   Fri Apr 19 08:47:53 2024 +0200

rtlanal: Fix set_noop_p for volatile loads or stores [PR114768]

On the following testcase, combine propagates the mem/v load into mem store
with the same address and then removes it, because noop_move_p says it is a
no-op move.  If it was the other way around, i.e. mem/v store and mem load,
or if both were mem/v, it would be kept.
The problem is that rtx_equal_p never checks any kind of flags on the rtxes
(and I think it would be quite dangerous to change it at this point), and
set_noop_p checks side_effects_p on just one of the operands, not both.
For a MEM <- MEM set, it only checks it on the destination; for a
store to ZERO_EXTRACT, it only checks it on the source.

The following patch adds the missing side_effects_p checks.

2024-04-19  Jakub Jelinek  

PR rtl-optimization/114768
* rtlanal.c (set_noop_p): Don't return true for MEM <- MEM
sets if src has side-effects or for stores into ZERO_EXTRACT
if ZERO_EXTRACT operand has side-effects.

* gcc.dg/pr114768.c: New test.

(cherry picked from commit 9f295847a9c32081bdd0fe908ffba58e830a24fb)
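
A heavily simplified model of the check described in the commit message above, written as a sketch. The struct mem_operand type and the function names are hypothetical. The real set_noop_p works on rtxes with rtx_equal_p and side_effects_p; here a single is_volatile flag stands in for side effects.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a memory operand.  */
struct mem_operand
{
  unsigned long address;
  bool is_volatile;  /* Corresponds to mem/v in RTL dumps.  */
};

/* Address-only equality that ignores flags, as the message describes
   for rtx_equal_p.  */
bool
same_location (const struct mem_operand *a, const struct mem_operand *b)
{
  return a->address == b->address;
}

/* Old behaviour in this model: only the destination is checked.  */
bool
is_noop_move_old (const struct mem_operand *dest,
                  const struct mem_operand *src)
{
  assert (dest && src);
  return same_location (dest, src) && !dest->is_volatile;
}

/* Fixed behaviour in this model: both operands are checked.  */
bool
is_noop_move_fixed (const struct mem_operand *dest,
                    const struct mem_operand *src)
{
  assert (dest && src);
  return same_location (dest, src)
         && !dest->is_volatile
         && !src->is_volatile;
}
```

With a volatile source and a plain destination at the same address, the old check in this model reports a no-op move and the fixed one does not.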

[Bug tree-optimization/114566] [11 Regression] Misaligned vmovaps when compiling with stack-protector-strong for znver4

2024-06-20 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114566

--- Comment #22 from GCC Commits  ---
The releases/gcc-11 branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:327533790760809abd2549e613a676dde5f8cd93

commit r11-11503-g327533790760809abd2549e613a676dde5f8cd93
Author: Jakub Jelinek 
Date:   Fri Apr 5 14:56:14 2024 +0200

vect: Don't clear base_misaligned in update_epilogue_loop_vinfo [PR114566]

The following testcase is miscompiled, because in the vectorized
epilogue the vectorizer assumes it can use aligned loads/stores
(if the base decl gets alignment increased), but it actually doesn't
increase that.
This is because r10-4203-g97c1460367 added the hunk that the following
patch removes.  The explanation feels reasonable, but actually it
is not true as the testcase proves.
The thing is, we vectorize the main loop with 64-byte vectors
and the corresponding data refs have base_alignment 16 (the
a array has DECL_ALIGN 128) and offset_alignment 32.  Now, because
of the offset_alignment 32 rather than 64, we need to use unaligned
loads/stores in the main loop (and ditto in the first load/store
in vectorized epilogue).  But the second load/store in the vectorized
epilogue uses only 32-byte vectors and because it is a multiple
of offset_alignment, it checks if we could increase alignment of the
a VAR_DECL, the function returns true, sets base_misaligned = true
and says the access is then aligned.
But when update_epilogue_loop_vinfo clears base_misaligned with the
assumption that the var had to have the alignment increased already,
the update of DECL_ALIGN doesn't happen anymore.

Now, I'd think this base_misaligned = false was needed before
r10-4030-gd2db7f7901 change was committed where it incorrectly
overwrote DECL_ALIGN even if it was already larger, rather than
just always increasing it.  But with that change in, it doesn't
make sense to me anymore.

Note, the testcase is latent on the trunk, but reproduces on the 13
branch.

2024-04-05  Jakub Jelinek  

PR tree-optimization/114566
* tree-vect-loop.c (update_epilogue_loop_vinfo): Don't clear
base_misaligned.

* gcc.target/i386/avx512f-pr114566.c: New test.

(cherry picked from commit a844095e17c1a5aada1364c6f6eaade87ead463c)
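
The alignment arithmetic in the commit message above can be illustrated with a small model. This is a sketch under simplifying assumptions, not the vectorizer's actual logic: the enum, the function name classify_access, and the reduction to three byte counts are all hypothetical, and it ignores whether the base declaration can really be realigned.

```c
#include <assert.h>

enum access_kind
{
  ACCESS_UNALIGNED,
  ACCESS_ALIGNED,
  ACCESS_ALIGNED_IF_BASE_REALIGNED
};

/* All arguments are byte counts and assumed to be powers of two.  */
enum access_kind
classify_access (unsigned vector_bytes, unsigned base_align,
                 unsigned offset_align)
{
  assert (vector_bytes && base_align && offset_align);
  /* The offset is not known to be a multiple of the vector size.  */
  if (offset_align < vector_bytes)
    return ACCESS_UNALIGNED;
  /* The base is already aligned enough.  */
  if (base_align >= vector_bytes)
    return ACCESS_ALIGNED;
  /* Aligned only if the alignment of the base is actually raised.  */
  return ACCESS_ALIGNED_IF_BASE_REALIGNED;
}
```

With the numbers from the message (base_alignment 16, offset_alignment 32), a 64-byte vector falls into the unaligned case, while a 32-byte vector falls into the third case. That is the situation where the DECL_ALIGN update must still happen for the aligned access to be valid.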

[Bug c++/114691] [11 Regression] Bogus ignoring loop annotation warning

2024-06-20 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114691

--- Comment #8 from GCC Commits  ---
The releases/gcc-11 branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:6ec50f5b9a8842f92d65dbd8fcc546f0f6902585

commit r11-11505-g6ec50f5b9a8842f92d65dbd8fcc546f0f6902585
Author: Jakub Jelinek 
Date:   Fri Apr 12 20:53:10 2024 +0200

c++: Fix bogus warnings about ignored annotations [PR114691]

The middle-end warns about the ANNOTATE_EXPR added for while/for loops
if they declare a var inside of the loop condition.
This is because the assumption is that ANNOTATE_EXPR argument is used
immediately in a COND_EXPR (later GIMPLE_COND), but simplify_loop_decl_cond
wraps the ANNOTATE_EXPR inside of a TRUTH_NOT_EXPR, so it no longer
holds.

The following patch fixes that by adding the TRUTH_NOT_EXPR inside of the
ANNOTATE_EXPR argument if any.

2024-04-12  Jakub Jelinek  

PR c++/114691
* semantics.c (simplify_loop_decl_cond): Use cp_build_unary_op with
TRUTH_NOT_EXPR on ANNOTATE_EXPR argument (if any) rather than
ANNOTATE_EXPR itself.

* g++.dg/ext/pr114691.C: New test.

(cherry picked from commit 91146346f57cc54dfeb2669347edd0eb3d13af7f)
