[Bug tree-optimization/114403] [14 regression] LLVM miscompiled with -O3 -march=znver2 -fno-vect-cost-model since r14-6822-g01f4251b8775c8

2024-04-12 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114403 --- Comment #24 from Tamar Christina --- (In reply to Richard Biener from comment #23) > Maybe easier to understand testcase: > > with -O3 -msse4.1 -fno-vect-cost-model we return 20 instead of 8. Adding > -fdisable-tree-cunroll avoids the issu

[Bug tree-optimization/114403] [14 regression] LLVM miscompiled with -O3 -march=znver2 -fno-vect-cost-model since r14-6822-g01f4251b8775c8

2024-04-12 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114403 --- Comment #26 from Tamar Christina --- (In reply to Richard Biener from comment #25) > That means, when the loop takes the early exit we _must_ take that during > the vector iterations. Peeling for gaps means if we would take the early > exit

[Bug tree-optimization/114403] [14 regression] LLVM miscompiled with -O3 -march=znver2 -fno-vect-cost-model since r14-6822-g01f4251b8775c8

2024-04-15 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114403 Tamar Christina changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|---

[Bug tree-optimization/113552] [11/12/13 Regression] vectorizer generates calls to vector math routines with 1 simd lane.

2024-04-15 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113552 Tamar Christina changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|---

[Bug target/114741] [14 regression] aarch64 sve: unnecessary fmov for scalar int bit operations

2024-04-16 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114741 Tamar Christina changed: What|Removed |Added CC||tnfchris at gcc dot gnu.org,

[Bug target/114741] [14 regression] aarch64 sve: unnecessary fmov for scalar int bit operations

2024-04-16 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114741 --- Comment #6 from Tamar Christina --- And the exact armv9-a cost model you quoted also does the right codegen. https://godbolt.org/z/obafoT6cj There is just an inexplicable penalty being applied to the r->r alternative.

[Bug target/114741] [14 regression] aarch64 sve: unnecessary fmov for scalar int bit operations

2024-04-17 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114741 Tamar Christina changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |tnfchris at gcc dot gnu.org

[Bug target/114741] [14 regression] aarch64 sve: unnecessary fmov for scalar int bit operations

2024-04-18 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114741 Tamar Christina changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED

[Bug target/114513] [11/12/13/14 Regression] [aarch64] floating-point registers are used when GPRs are preferred

2024-04-18 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114513 Bug 114513 depends on bug 114741, which changed state. Bug 114741 Summary: [14 regression] aarch64 sve: unnecessary fmov for scalar int bit operations https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114741 What|Removed

[Bug rtl-optimization/114766] New: ^ constraint modifier unexpectedly affects register class selection.

2024-04-18 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114766 Bug ID: 114766 Summary: ^ constraint modifier unexpectedly affects register class selection. Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: missed-op

[Bug target/113625] Interesting behavior with and without -mcpu=generic

2024-04-18 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113625 Tamar Christina changed: What|Removed |Added CC||tnfchris at gcc dot gnu.org --- Comme

[Bug tree-optimization/114769] [14 Regression] Suspicious code in vect_recog_sad_pattern() since r14-1832

2024-04-19 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114769 --- Comment #2 from Tamar Christina --- I believe this is safe, but the interface is definitely not the cleanest. vect_recog_absolute_difference has two callers: 1. vect_recog_sad_pattern where if you return true with unprom not set, then *hal

[Bug tree-optimization/114769] [14 Regression] Suspicious code in vect_recog_sad_pattern() since r14-1832

2024-04-19 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114769 Tamar Christina changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED

[Bug rtl-optimization/114766] ^ constraint modifier unexpectedly affects register class selection.

2024-04-20 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114766 --- Comment #2 from Tamar Christina --- (In reply to Vladimir Makarov from comment #1) > (In reply to Tamar Christina from comment #0) > > The documentation for ^ states: > > If it works for you, we could try to use the patch (although it needs

[Bug target/114860] [14/15 regression] [aarch64] 511.povray regresses by ~5.5% with -O3 -flto -march=native -mcpu=neoverse-v2 since r14-10014-ga2f4be3dae04fa

2024-04-26 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114860 --- Comment #1 from Tamar Christina --- Hmm, I am unable to reproduce this with -O3 -flto -mcpu=neoverse-v2 on a neoverse-v2 machine. Is any other option required? Also, that code was new in GCC 14 and was partially reverted due to register al

[Bug target/114860] [14/15 regression] [aarch64] 511.povray regresses by ~5.5% with -O3 -flto -march=native -mcpu=neoverse-v2 since r14-10014-ga2f4be3dae04fa

2024-04-26 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114860 Tamar Christina changed: What|Removed |Added CC||tnfchris at gcc dot gnu.org --- Comme

[Bug target/114860] [14/15 regression] [aarch64] 511.povray regresses by ~5.5% with -O3 -flto -march=native -mcpu=neoverse-v2 since r14-10014-ga2f4be3dae04fa

2024-04-30 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114860 --- Comment #3 from Tamar Christina --- I cannot reproduce this even recompiling libc.

[Bug ipa/92538] Proposal for IPA init() constant propagation

2024-05-02 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92538 Tamar Christina changed: What|Removed |Added CC||jamborm at gcc dot gnu.org --- Comment

[Bug tree-optimization/114932] New: Improvement in CHREC can give large performance gains

2024-05-02 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932 Bug ID: 114932 Summary: Improvement in CHREC can give large performance gains Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: missed-optimization Severity: norm

[Bug tree-optimization/114932] Improvement in CHREC can give large performance gains

2024-05-03 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932 --- Comment #3 from Tamar Christina --- (In reply to Andrew Pinski from comment #2) > > which is harder for prefetchers to follow. > > This seems like a limitation in the HW prefetcher rather than anything else. > Maybe the cost model for addre

[Bug tree-optimization/114932] Improvement in CHREC can give large performance gains

2024-05-03 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932 --- Comment #4 from Tamar Christina --- reduced more: --- module brute_force integer, parameter :: r=9 integer block(r, r, 0) contains subroutine brute do do do do do

[Bug tree-optimization/114932] Improvement in CHREC can give large performance gains

2024-05-03 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932 --- Comment #5 from Tamar Christina --- Created attachment 58095 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58095&action=edit exchange2.fppized-good.f90.187t.ivopts

[Bug tree-optimization/114932] Improvement in CHREC can give large performance gains

2024-05-03 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932 --- Comment #6 from Tamar Christina --- Created attachment 58096 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58096&action=edit exchange2.fppized-bad.f90.187t.ivopts

[Bug tree-optimization/114932] Improvement in CHREC can give large performance gains

2024-05-13 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932 Tamar Christina changed: What|Removed |Added Ever confirmed|0 |1 Status|UNCONFIRMED

[Bug target/114412] [14/15 Regression] 7% slowdown of 436.cactusADM on aarch64

2024-05-13 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114412 Tamar Christina changed: What|Removed |Added CC||tnfchris at gcc dot gnu.org --- Comme

[Bug target/115087] New: dead block not eliminated in SVE intrinsics code

2024-05-14 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115087 Bug ID: 115087 Summary: dead block not eliminated in SVE intrinsics code Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: missed-optimization Severity: normal

[Bug target/114412] [14/15 Regression] 7% slowdown of 436.cactusADM on aarch64

2024-05-16 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114412 --- Comment #5 from Tamar Christina --- (In reply to Filip Kastl from comment #4) > (In reply to Tamar Christina from comment #3) > > Hi Filip, > > > > Do you generate these runs with counters based PGO or compiler > > instrumentation? > > > >

[Bug target/114860] [14/15 regression] [aarch64] 511.povray regresses by ~5.5% with -O3 -flto -march=native -mcpu=neoverse-v2 since r14-10014-ga2f4be3dae04fa

2024-05-16 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114860 --- Comment #7 from Tamar Christina --- Yeah, it's most likely an alignment issue, especially as there are no code changes. We run our benchmarking with different flags, so that may be why we don't see it. The loop seems misaligned; you can try incr

[Bug tree-optimization/115120] Bad interaction between ivcanon and early break vectorization

2024-05-17 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115120 --- Comment #3 from Tamar Christina --- That makes sense, though I also wonder how it works for scalar multi-exit loops; IVopts has various checks on single exits. I guess one problem is that the code in IVopts that does this uses the exit to det

[Bug tree-optimization/115130] New: (early-break) [meta-bug] early break vectorization

2024-05-17 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115130 Bug ID: 115130 Summary: (early-break) [meta-bug] early break vectorization Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: meta-bug, missed-optimization Severit

[Bug tree-optimization/115130] (early-break) [meta-bug] early break vectorization

2024-05-17 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115130 Tamar Christina changed: What|Removed |Added Ever confirmed|0 |1 Last reconfirmed|

[Bug target/114860] [14/15 regression] [aarch64] 511.povray regresses by ~5.5% with -O3 -flto -march=native -mcpu=neoverse-v2 since r14-10014-ga2f4be3dae04fa

2024-05-22 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114860 --- Comment #9 from Tamar Christina --- (In reply to prathamesh3492 from comment #8) > Hi Tamar, > Using -falign-loops=5 indeed brings back the performance. > The adrp instruction has same address (0x4ae784) by setting -falign-loops=5 > (which r

[Bug tree-optimization/114932] IVopts inefficient handling of signed IV used for addressing.

2024-06-05 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932 --- Comment #9 from Tamar Christina --- It's taken me a bit of time to track down all the reasons for the speedup with the earlier patch. This comes from two parts: 1. Signed IVs don't get simplified. Due to possible UB with signed overflows
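The comment above is cut off, so the following is only a rough sketch of the kind of loop under discussion: an induction variable of signed type used for addressing, next to an unsigned equivalent. The function names `sum_signed_iv` and `sum_unsigned_iv` are made up for illustration and do not come from the bug report. Both functions compute the same result; whether GCC generates different code for them depends on the compiler version and target and is not shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: the index i is a signed int, so overflowing it
   would be undefined behaviour. */
long sum_signed_iv(const int *a, int n)
{
    assert(a != NULL || n <= 0);
    long sum = 0;
    for (int i = 0; i < n; i++)
        sum += a[i];
    return sum;
}

/* Same loop with an unsigned index, where wrap-around is well defined. */
long sum_unsigned_iv(const int *a, size_t n)
{
    assert(a != NULL || n == 0);
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i];
    return sum;
}
```

Comparing the ivopts dump of the two variants (for example with -fdump-tree-ivopts) is one way to see how each induction variable is treated.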

[Bug tree-optimization/54013] Loop with control flow not vectorized

2024-06-05 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54013 Tamar Christina changed: What|Removed |Added Blocks||115130 --- Comment #4 from Tamar Chris

[Bug tree-optimization/114932] IVopts inefficient handling of signed IV used for addressing.

2024-06-05 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932 --- Comment #11 from Tamar Christina --- (In reply to Richard Biener from comment #10) > I think the question is why IVOPTs ends up using both the signed and > unsigned variant of the same IV instead of expressing all uses of both with > one IV?

[Bug tree-optimization/114932] IVopts inefficient handling of signed IV used for addressing.

2024-06-05 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932 --- Comment #13 from Tamar Christina --- (In reply to rguent...@suse.de from comment #12) > > since we don't care about overflow here, it looks like the stripping should > > be recursive as long as it's a NOP expression between two integral typ

[Bug tree-optimization/114932] IVopts inefficient handling of signed IV used for addressing.

2024-06-06 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932 --- Comment #15 from Tamar Christina --- (In reply to rguent...@suse.de from comment #14) > On Thu, 6 Jun 2024, tnfchris at gcc dot gnu.org wrote: > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932 > > > > --- Comment #13 from Tamar Chr

[Bug target/115464] ICE when building libaom on arm64 (neon sve bridge usage with tbl/perm)

2024-06-12 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115464 Tamar Christina changed: What|Removed |Added Last reconfirmed||2024-06-12 CC|

[Bug target/115464] ICE when building libaom on arm64 (neon sve bridge usage with tbl/perm)

2024-06-12 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115464 Tamar Christina changed: What|Removed |Added CC||rsandifo at gcc dot gnu.org --- Comme

[Bug target/115464] ICE when building libaom on arm64 (neon sve bridge usage with tbl/perm)

2024-06-12 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115464 --- Comment #6 from Tamar Christina --- (In reply to Richard Sandiford from comment #5) > In this kind of situation, we should go through a fresh pseudo rather than > try to take the subreg directly. I did try that but fwprop pushed it back in.

[Bug target/115464] ICE when building libaom on arm64 (neon sve bridge usage with tbl/perm)

2024-06-12 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115464 --- Comment #7 from Tamar Christina --- (In reply to Tamar Christina from comment #6) > (In reply to Richard Sandiford from comment #5) > > In this kind of situation, we should go through a fresh pseudo rather than > > try to take the subreg dir

[Bug target/115464] [14 Backport] ICE when building libaom on arm64 (neon sve bridge usage with tbl/perm)

2024-06-13 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115464 --- Comment #10 from Tamar Christina --- Thanks for the fix, but I don't think it's sufficient. What I meant with the earlier comment was that the subregs are broken in general, so not just the one generated by the undef fast path. i.e. #incl

[Bug tree-optimization/115531] New: vectorizer generates inefficient code for masked conditional update loops

2024-06-17 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115531 Bug ID: 115531 Summary: vectorizer generates inefficient code for masked conditional update loops Product: gcc Version: 15.0 Status: UNCONFIRMED Keywords: miss

[Bug tree-optimization/115531] vectorizer generates inefficient code for masked conditional update loops

2024-06-17 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115531 --- Comment #3 from Tamar Christina --- (In reply to Andrew Pinski from comment #1) > I suspect PR 20999 would fix this ... > but we have to be careful since without masked stores, you could still > vectorize this unlike the transformed version.

[Bug tree-optimization/115534] New: intermediate stack use not eliminated

2024-06-18 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115534 Bug ID: 115534 Summary: intermediate stack use not eliminated Product: gcc Version: 15.0 Status: UNCONFIRMED Keywords: missed-optimization Severity: normal Pri

[Bug tree-optimization/115537] New: [15 Regression] vectorizable_reduction ICEs after g:d66b820f392aa9a7c34d3cddaf3d7c73bf23f82d

2024-06-18 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115537 Bug ID: 115537 Summary: [15 Regression] vectorizable_reduction ICEs after g:d66b820f392aa9a7c34d3cddaf3d7c73bf23f82d Product: gcc Version: 15.0 Status: UNCONFIRMED

[Bug middle-end/115534] intermediate stack use not eliminated

2024-06-18 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115534 --- Comment #2 from Tamar Christina --- (In reply to Andrew Pinski from comment #1) > I suspect there is a dup of this already. See the bug which I made this one > blocking for a list of related bugs. Most of the other bugs relate to the argume

[Bug tree-optimization/115537] [15 Regression] vectorizable_reduction ICEs after g:d66b820f392aa9a7c34d3cddaf3d7c73bf23f82d

2024-06-18 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115537 --- Comment #5 from Tamar Christina --- Thanks for the fix! I think the testcase needs SVE enabled to ICE, no? Shouldn't that be -mcpu=neoverse-v1 and not -mcpu=neoverse-n1?

[Bug middle-end/115534] intermediate stack use not eliminated

2024-06-18 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115534 --- Comment #5 from Tamar Christina --- (In reply to Andrew Pinski from comment #4) > This might be improved by > https://gcc.gnu.org/pipermail/gcc-patches/2024-June/654819.html . Or it > might be the case the vectorizer case needs to be improve

[Bug middle-end/115597] New: [15 Regression] vectorizer takes 20+ h compiling 510.parest in SPECCPU2017 since g:46bb4ce4d30ab749d40f6f4cef6f1fb7c7813452

2024-06-23 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115597 Bug ID: 115597 Summary: [15 Regression] vectorizer takes 20+ h compiling 510.parest in SPECCPU2017 since g:46bb4ce4d30ab749d40f6f4cef6f1fb7c7813452 Product: gcc

[Bug middle-end/115597] [15 Regression] vectorizer takes 20+ h compiling 510.parest in SPECCPU2017 since g:46bb4ce4d30ab749d40f6f4cef6f1fb7c7813452

2024-06-23 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115597 --- Comment #3 from Tamar Christina --- > > Can you check whether that fixes the issue? > > diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc > index 9465d94de1a..212d5f97f7d 100644 > --- a/gcc/tree-vect-slp.cc > +++ b/gcc/tree-vect-slp

[Bug middle-end/115597] [15 Regression] vectorizer takes 20+ h compiling 510.parest in SPECCPU2017 since g:46bb4ce4d30ab749d40f6f4cef6f1fb7c7813452

2024-06-23 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115597 --- Comment #4 from Tamar Christina --- (In reply to Richard Biener from comment #2) > Ah, I feared this would happen - this case seems to be because of a lot of > VEC_PERM nodes(?) which are not handled by the CSE process as well as the > two-o

[Bug tree-optimization/115120] Bad interaction between ivcanon and early break vectorization

2024-06-24 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115120 --- Comment #4 from Tamar Christina --- Richi, you asked why this doesn't happen with a normal vector loop. For a normal loop, when ivcanon adds the downward-counting loop, there are two main differences. 1. For a single-exit loop, the downward

[Bug c++/115623] ICE: Segmentation fault in finish_for_cond with novector and almost infinite loop

2024-06-24 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115623 Tamar Christina changed: What|Removed |Added Status|NEW |ASSIGNED Assignee|unassigne

[Bug c++/115623] ICE: Segmentation fault in finish_for_cond with novector and almost infinite loop

2024-06-24 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115623 --- Comment #4 from Tamar Christina --- novect3.c: In function 'void f(char*, int)': novect3.c:4:9: error: missing loop condition in loop with 'GCC novector' pragma before ';' token 4 | for (;;i++) | should do it, will send
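For context, a minimal sketch of a loop that would not trigger the diagnostic above: the `GCC novector` pragma is followed by a `for` statement with an explicit condition, unlike the `for (;;i++)` in the testcase. The function name `count_nonzero` and its arguments are made up for illustration. The pragma is assumed to be available (it is a recent GCC addition); compilers that do not recognize it are expected to ignore it, possibly with a warning.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: counts the non-zero bytes in buf[0..n). */
size_t count_nonzero(const char *buf, size_t n)
{
    assert(buf != NULL || n == 0);
    size_t count = 0;
    /* Ask GCC not to vectorize the loop that follows. The loop has an
       explicit condition (i < n), which the pragma requires. */
#pragma GCC novector
    for (size_t i = 0; i < n; i++)
        if (buf[i] != 0)
            count++;
    return count;
}
```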

[Bug tree-optimization/115531] vectorizer generates inefficient code for masked conditional update loops

2024-06-24 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115531 Tamar Christina changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |tnfchris at gcc dot gnu.org

[Bug tree-optimization/115629] New: Inefficient if-convert of masked conditionals

2024-06-24 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115629 Bug ID: 115629 Summary: Inefficient if-convert of masked conditionals Product: gcc Version: 15.0 Status: UNCONFIRMED Keywords: missed-optimization Severity: normal

[Bug tree-optimization/115120] Bad interaction between ivcanon and early break vectorization

2024-06-25 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115120 --- Comment #5 from Tamar Christina --- Considering ivopts bails out on doloop prediction for multiple exits anyway, what do you think about: diff --git a/gcc/tree-ssa-loop-ivcanon.cc b/gcc/tree-ssa-loop-ivcanon.cc index 5ef24a91917..d1b25ad7de

[Bug libstdc++/88545] std::find compile to memchr in trivial random access cases (patch)

2024-07-01 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88545 --- Comment #11 from Tamar Christina --- (In reply to Jonathan Wakely from comment #9) > Patch posted: https://gcc.gnu.org/pipermail/gcc-patches/2024-June/653731.html > > Rerunning benchmarks with this patch would be very welcome. OK, I have te

[Bug tree-optimization/115629] Inefficient if-convert of masked conditionals

2024-07-01 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115629 --- Comment #4 from Tamar Christina --- (In reply to Richard Biener from comment #3) > So we now tail-merge the two b[i] loading blocks. Can you check SVE > code-gen with this? If that fixes the PR consider adding a SVE testcase. Thanks, the

[Bug libstdc++/88545] std::find compile to memchr in trivial random access cases (patch)

2024-07-01 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88545 --- Comment #12 from Tamar Christina --- I had a bug in the benchmark: I forgot to set taskset. These are the correct ones: ++---+-+-+ | NEEDLE | scalar 1x | vect| memchr | ++---+-+--
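The table header above compares a scalar loop, a vectorized loop, and memchr. The bug itself concerns libstdc++'s std::find in C++; as a rough C analogue only, the sketch below puts a byte-by-byte search next to the same search expressed with the standard `memchr`. The names `find_scalar` and `find_memchr` are made up for illustration and are not the benchmark from the bug report; both return the index of the first match, or `n` when the needle is absent.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Scalar reference: index of the first byte equal to needle, or n. */
size_t find_scalar(const unsigned char *hay, size_t n, unsigned char needle)
{
    assert(hay != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (hay[i] == needle)
            return i; /* early exit on the first match */
    return n;
}

/* Same search delegated to the C library's memchr. */
size_t find_memchr(const unsigned char *hay, size_t n, unsigned char needle)
{
    assert(hay != NULL || n == 0);
    if (n == 0)
        return 0; /* avoid calling memchr on an empty range */
    const unsigned char *p = memchr(hay, needle, n);
    return p ? (size_t)(p - hay) : n;
}
```

The scalar version is the kind of loop with an early exit that the vectorizer has to handle itself, while the memchr version leaves the search strategy to the library.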

[Bug tree-optimization/115629] Inefficient if-convert of masked conditionals

2024-07-01 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115629 --- Comment #6 from Tamar Christina --- (In reply to rguent...@suse.de from comment #5) > > In this case, the second load is conditional on the first load mask, which > > means it's already done an AND. > > And crucially inverting it means you

[Bug c++/115623] ICE: Segmentation fault in finish_for_cond with novector and almost infinite loop

2024-07-04 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115623 Tamar Christina changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED

[Bug fortran/90608] Inline non-scalar minloc/maxloc calls

2024-07-05 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90608 --- Comment #19 from Tamar Christina --- Hi Mikael, It looks like the last version of your patch already gets inlined in the call sites we cared about. Would it be possible for you to upstream it?

[Bug tree-optimization/104265] Missed vectorization in 526.blender_r

2024-07-05 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104265 --- Comment #4 from Tamar Christina --- (In reply to Richard Biener from comment #3) > Note the SLP discovery opportunity is from the "reduction" PHI to the > return which merges control flow to a zero/one flag. Right, so I get what you mean he

[Bug tree-optimization/104265] Missed vectorization in 526.blender_r

2024-07-05 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104265 --- Comment #5 from Tamar Christina --- Also, for fully masked architectures we can, instead of recreating the vectors, just mask out the irrelevant values. But we should still order the exits based on complexity.

[Bug libstdc++/115799] ranges::find's optimized branching for memchr is not quite right

2024-07-06 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115799 Tamar Christina changed: What|Removed |Added CC||tnfchris at gcc dot gnu.org --- Comme

[Bug tree-optimization/115866] New: missed optimization vectorizing switch statements.

2024-07-10 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115866 Bug ID: 115866 Summary: missed optimization vectorizing switch statements. Product: gcc Version: 15.0 Status: UNCONFIRMED Keywords: missed-optimization Severity: normal

[Bug tree-optimization/115866] missed optimization vectorizing switch statements.

2024-07-11 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115866 Tamar Christina changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |tnfchris at gcc dot gnu.org -

[Bug target/115934] [15 Regression] nvptx vs. ivopts: replace constant_multiple_of with aff_combination_constant_multiple_p [PR114932]

2024-07-15 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115934 --- Comment #1 from Tamar Christina --- Hi, thanks for the report, could you tell me a target triple I can use for nvptx?

[Bug target/115934] [15 Regression] nvptx vs. ivopts: replace constant_multiple_of with aff_combination_constant_multiple_p [PR114932]

2024-07-15 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115934 Tamar Christina changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |tnfchris at gcc dot gnu.org

[Bug target/115936] [15 Regression] GCN vs. ivopts: replace constant_multiple_of with aff_combination_constant_multiple_p [PR114932]

2024-07-15 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115936 Tamar Christina changed: What|Removed |Added Ever confirmed|0 |1 CC|

[Bug target/115936] [15 Regression] GCN vs. ivopts: replace constant_multiple_of with aff_combination_constant_multiple_p [PR114932]

2024-07-15 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115936 Tamar Christina changed: What|Removed |Added Target Milestone|--- |15.0 --- Comment #2 from Tamar Christ

[Bug target/115934] [15 Regression] nvptx vs. ivopts: replace constant_multiple_of with aff_combination_constant_multiple_p [PR114932]

2024-07-15 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115934 --- Comment #4 from Tamar Christina --- This one looks a bit like costing, before the patch IVopts had: : inv_expr 1: -element_7(D) inv_expr 2: (signed int) rite_5(D) - (signed int) element_7(D) and after the patch it generates a few

[Bug target/115936] [15 Regression] GCN vs. ivopts: replace constant_multiple_of with aff_combination_constant_multiple_p [PR114932]

2024-07-15 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115936 --- Comment #4 from Tamar Christina --- (In reply to Richard Biener from comment #3) > iv->step should never be a pointer type That's what I initially thought too. My suspicion is that there is some code that tries to create the 0 offset. I'l

[Bug target/115934] [15 Regression] nvptx vs. ivopts: replace constant_multiple_of with aff_combination_constant_multiple_p [PR114932]

2024-07-15 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115934 --- Comment #7 from Tamar Christina --- (In reply to Thomas Schwinge from comment #6) > Tamar, Richard, thanks for having a look. > > (In reply to Tamar Christina from comment #4) > > This one looks a bit like costing, [...] > > I see. So we

[Bug target/115936] [15 Regression] GCN vs. ivopts: replace constant_multiple_of with aff_combination_constant_multiple_p [PR114932]

2024-07-15 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115936 --- Comment #5 from Tamar Christina --- (In reply to Richard Biener from comment #3) > iv->step should never be a pointer type This is created by SCEV. simple_iv_with_niters in the case where no CHREC is found creates an IV with base == ev, of

[Bug target/115936] [15 Regression] GCN vs. ivopts: replace constant_multiple_of with aff_combination_constant_multiple_p [PR114932]

2024-07-15 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115936 --- Comment #6 from Tamar Christina --- (In reply to Richard Biener from comment #3) > iv->step should never be a pointer type This is created by SCEV. simple_iv_with_niters in the case where no CHREC is found creates an IV with base == ev, of

[Bug tree-optimization/115936] [15 Regression] GCN vs. ivopts: replace constant_multiple_of with aff_combination_constant_multiple_p [PR114932]

2024-07-17 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115936 Tamar Christina changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED

[Bug target/102218] 128-bit atomic compare and exchange does not honor memory model on AArch64 and Arm

2022-12-21 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102218 --- Comment #4 from Tamar Christina --- (In reply to ktkachov from comment #3) > Does this need to be backported to other release versions as it's a > wrong-code bug? Yes, ideally. I did ask for a backport, but it was only approved for master.

[Bug target/108172] [13 Regression] ICE in emit_move_multi_word, at expr.cc:4087 since r13-4616-g2cba118e538ba0

2023-01-06 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108172 Tamar Christina changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |tnfchris at gcc dot gnu.org -

[Bug target/108172] [13 Regression] ICE in emit_move_multi_word, at expr.cc:4087 since r13-4616-g2cba118e538ba0

2023-01-06 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108172 Tamar Christina changed: What|Removed |Added Resolution|--- |FIXED Status|NEW

[Bug driver/97574] Allow for nul output with Windows

2023-01-18 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97574 Tamar Christina changed: What|Removed |Added Status|NEW |RESOLVED Resolution|---

[Bug ipa/107925] ICE in update_specialized_profile at gcc/ipa-cp.cc:5082 for 531.deepsjeng_r benchmark

2023-01-27 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107925 --- Comment #5 from Tamar Christina --- I seem to have the same failure in at least GCC 12 as well.

[Bug testsuite/108394] aarch64/cpunative/native_cpu_18.c fails in GCC 11 and 12

2023-01-27 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108394 Tamar Christina changed: What|Removed |Added CC||tnfchris at gcc dot gnu.org --- Comme

[Bug target/108583] [13 Regression] wrong code with vector division by uint16 at -O2

2023-01-30 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108583 Tamar Christina changed: What|Removed |Added CC||rguenth at gcc dot gnu.org,

[Bug target/108583] [13 Regression] wrong code with vector division by uint16 at -O2

2023-01-30 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108583 --- Comment #5 from Tamar Christina --- > > The vectorizer has this context but since we didn't want a new IFN the > > context should instead be derivable in > > targetm.vectorize.can_special_div_by_const hook. > > The vectorizer doesn't chec

[Bug tree-optimization/108601] New: [13 Regression] vector peeling ICEs with PGO + LTO + IPA inlining in gcc_r in SPEC2017

2023-01-30 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108601 Bug ID: 108601 Summary: [13 Regression] vector peeling ICEs with PGO + LTO + IPA inlining in gcc_r in SPEC2017 Product: gcc Version: 13.0 Status: UNCONFIRMED K

[Bug target/108583] [13 Regression] wrong code with vector division by uint16 at -O2

2023-01-30 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108583 --- Comment #7 from Tamar Christina --- (In reply to rsand...@gcc.gnu.org from comment #6) > (In reply to Tamar Christina from comment #3) > > The vectorizer has this context but since we didn't want a new IFN the > > context should instead be d

[Bug tree-optimization/108601] [13 Regression] vector peeling ICEs with PGO + LTO + IPA inlining in gcc_r in SPEC2017

2023-01-31 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108601 --- Comment #6 from Tamar Christina --- Probably relevant that I can only reproduce it on an SVE/VLA system. Non-VLA works fine. I have cvise running, trying for a repro.

[Bug target/108583] [13 Regression] wrong code with vector division by uint16 at -O2

2023-01-31 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108583 --- Comment #10 from Tamar Christina --- (In reply to rsand...@gcc.gnu.org from comment #9) > Are we sure this is a vectoriser vs. C vectors thing? It's not; the issue we're debating is how to fix it. As Richi pointed out https://gcc.gnu.org/b

[Bug target/108583] [13 Regression] wrong code with vector division by uint16 at -O2

2023-01-31 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108583 --- Comment #12 from Tamar Christina --- (In reply to rguent...@suse.de from comment #11) > On Tue, 31 Jan 2023, tnfchris at gcc dot gnu.org wrote: > > > I don't think passing in for example the tree operand 0 helps, the > target apparently w

[Bug target/108583] [13 Regression] wrong code with vector division by uint16 at -O2

2023-01-31 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108583 --- Comment #15 from Tamar Christina --- > OK, hopefully I understand now. Sorry for being slow. Not at all. Sorry if it came across as a bit cranky, it wasn't meant that way! > If that's the condition we want to test for, it seems like somethin

[Bug target/108583] [13 Regression] wrong code with vector division by uint16 at -O2

2023-01-31 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108583 --- Comment #18 from Tamar Christina --- > > > > Ack, that also tracks with what I tried before, we don't indeed track ranges > > for vector ops. The general case can still be handled slightly better (I > > think) > > but it doesn't become as

[Bug tree-optimization/108601] [13 Regression] vector peeling ICEs with VLA in gcc_r in SPEC2017 since g:c13223b790bbc5e4a3f5605e057eac59b61b2c85

2023-01-31 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108601 Tamar Christina changed: What|Removed |Added Target||aarch64* Summary|[13 Regre

[Bug tree-optimization/108601] [13 Regression] vector peeling ICEs with VLA in gcc_r in SPEC2017 since g:c13223b790bbc5e4a3f5605e057eac59b61b2c85

2023-01-31 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108601 --- Comment #8 from Tamar Christina --- In case it helps, here's the reproducer on compiler explorer and the dump file https://godbolt.org/z/dWvqexjnv

[Bug target/108583] [13 Regression] wrong code with vector division by uint16 at -O2

2023-02-01 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108583 --- Comment #20 from Tamar Christina --- > > I don't think so for addhn, because it wouldn't truncate the top bits, it > > truncates the bottom bits. > > > > The instruction does > > element1 = Elem[operand1, e, 2*esize]; > > element2 =

[Bug target/108583] [13 Regression] wrong code with vector division by uint16 at -O2

2023-02-02 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108583 --- Comment #21 from Tamar Christina --- > > OK, so that's an ADD_HIGHPART_EXPR then? Though the highpart of an > add is only a single bit, isn't it? For scalar you'd use the > carry bit here and instructions like adc to consume it. Is addhn

[Bug target/108583] [13 Regression] wrong code with vector division by uint16 at -O2

2023-02-02 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108583 --- Comment #23 from Tamar Christina --- (In reply to rguent...@suse.de from comment #22) > On Thu, 2 Feb 2023, tnfchris at gcc dot gnu.org wrote: > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108583 > > > > --- Comment #21 from Tamar Chr

[Bug target/108583] [13 Regression] wrong code with vector division by uint16 at -O2

2023-02-08 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108583 --- Comment #24 from Tamar Christina --- > Sure that works I think, I'll do that then. Just to check, I'm regtesting the patch, I assume you want me to revert the hook as well right? Since nothing will be using it.
