from:"tnfchris at gcc dot gnu.org via Gcc-bugs"

[Bug rtl-optimization/113682] Branches in branchless binary search rather than cmov/csel/csinc

2024-04-03 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113682 --- Comment #9 from Tamar Christina --- (In reply to Andrew Pinski from comment #8) > This might be the path splitting running on the gimple level causing issues > too; see PR 112402 . Ah that's a good shout. It looks like Richi already

[Bug target/114577] New: Inefficient codegen for SVE/NEON bridge

2024-04-03 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114577 Bug ID: 114577 Summary: Inefficient codegen for SVE/NEON bridge Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: missed-optimization Severity: normal

[Bug target/114510] [14 Regression] missed proping of multiply by 2 into address of load/stores

2024-04-03 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114510 Tamar Christina changed: What|Removed |Added CC||tnfchris at gcc dot gnu.org ---

[Bug tree-optimization/114403] [14 regression] LLVM miscompiled with -O3 -march=znver2 -fno-vect-cost-model since r14-6822-g01f4251b8775c8

2024-04-02 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114403 --- Comment #20 from Tamar Christina --- This is a bad interaction with early break and peeling for gaps. When peeling for gaps we set bias_for_lowest to 0, which then negates the ceil for the upper bound calculation when the div is exact. We

[Bug tree-optimization/114403] [14 regression] LLVM miscompiled with -O3 -march=znver2 -fno-vect-cost-model since r14-6822-g01f4251b8775c8

2024-04-02 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114403 Tamar Christina changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Last reconfirmed|

[Bug rtl-optimization/113682] Branches in branchless binary search rather than cmov/csel/csinc

2024-04-02 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113682 Tamar Christina changed: What|Removed |Added Status|UNCONFIRMED |NEW Ever confirmed|0

[Bug tree-optimization/113441] [14 Regression] Fail to fold the last element with multiple loop since g:2efe3a7de0107618397264017fb045f237764cc7

2024-02-22 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441 Tamar Christina changed: What|Removed |Added Ever confirmed|0 |1 Summary|[14 Regression]

[Bug tree-optimization/114061] New: GCC fails vectorization when using __builtin_prefetch

2024-02-22 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114061 Bug ID: 114061 Summary: GCC fails vectorization when using __builtin_prefetch Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: missed-optimization Severity:

[Bug target/113257] -march=native or -mcpu=native are ineffective, but -march=native -mcpu=native works on arm64 M2 Ultra

2024-02-22 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113257 --- Comment #5 from Tamar Christina --- (In reply to Sam James from comment #3) > (In reply to Richard Earnshaw from comment #2) > I'm missing why the combination then works though? So we've made several changes here over time. -mcpu=native

[Bug tree-optimization/114061] GCC fails vectorization when using __builtin_prefetch

2024-02-22 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114061 --- Comment #2 from Tamar Christina --- (In reply to Andrew Pinski from comment #1) > I thought there was already one recorded about this. I could only find https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103938 about an ICE when prefetching a

[Bug tree-optimization/114061] GCC fails vectorization when using __builtin_prefetch

2024-02-22 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114061 --- Comment #4 from Tamar Christina --- (In reply to Andrew Pinski from comment #3) > Confirmed. > > Though maybe we should drop them in the vectorized version of the loop. HW > prefetchers usually do a decent job and sometimes (maybe most) SW

[Bug target/114063] New: Use IFN_CHECK_RAW_PTRS/IFN_CHECK_WAR_PTRS for Advanced. SIMD

2024-02-22 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114063 Bug ID: 114063 Summary: Use IFN_CHECK_RAW_PTRS/IFN_CHECK_WAR_PTRS for Advanced. SIMD Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords:

[Bug tree-optimization/114068] [14 regression] ICE when building darktable-4.6.1 (error: PHI node with wrong VUSE on edge from BB 25) since r14-8768

2024-02-23 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114068 --- Comment #12 from Tamar Christina --- Looks like the moving of the store didn't update a stray out-of-block use of the MEM. Working on a patch.

[Bug target/102171] vget_low_/vget_high_ intrinsics should become BIT_FIELD_REF during gimple

2024-02-27 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102171 --- Comment #3 from Tamar Christina --- (In reply to Andrew Pinski from comment #2) > I think I am going to implement this (or assign it interally to someone else > to implement). If you do, please also remove them from arm_neon.h and use the

[Bug tree-optimization/86530] Vectorization failure for a simple loop

2024-02-26 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86530 --- Comment #8 from Tamar Christina --- (In reply to Andrew Pinski from comment #6) > With my patch for V4QI, we still don't get the best code: > vect_perm_even_271 = VEC_PERM_EXPR 4, 6 }>; > vect_perm_even_273 = VEC_PERM_EXPR 4, 6 }>; >

[Bug tree-optimization/113441] [14 Regression] Fail to fold the last element with multiple loop since g:2efe3a7de0107618397264017fb045f237764cc7

2024-02-27 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441 Tamar Christina changed: What|Removed |Added CC||rsandifo at gcc dot gnu.org ---

[Bug tree-optimization/114099] [14 regression] ICE in find_uses_to_rename_use when building darktable-4.6.1

2024-02-25 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114099 --- Comment #8 from Tamar Christina --- Created attachment 57537 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57537&action=edit uses.patch new code seems sensitive to visitation order as get_virtual_phi returns NULL for blocks which don't

[Bug tree-optimization/113441] [14 Regression] Fail to fold the last element with multiple loop since g:2efe3a7de0107618397264017fb045f237764cc7

2024-02-26 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441 --- Comment #27 from Tamar Christina --- Created attachment 57538 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57538&action=edit proposed1.patch Proposed patch; this gets the gathers and scatters back. Doing a regression run.

[Bug tree-optimization/114192] scalar code left around following early break vectorization of reduction

2024-03-01 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114192 Tamar Christina changed: What|Removed |Added Ever confirmed|0 |1 Status|UNCONFIRMED

[Bug target/98877] [AArch64] Inefficient code generated for tbl NEON intrinsics

2024-02-28 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98877 --- Comment #9 from Tamar Christina --- While RA should be able to deal with this, shouldn't we also just lower TBLs in gimple? There's no reason why this can't be a VEC_PERM_EXPR which would also get the copies removed at the gimple level and

[Bug tree-optimization/114151] New: [14 Regression] weird and inefficient codegen and addressing modes since g:a0b1798042d033fd2cc2c806afbb77875dd2909b

2024-02-28 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114151 Bug ID: 114151 Summary: [14 Regression] weird and inefficient codegen and addressing modes since g:a0b1798042d033fd2cc2c806afbb77875dd2909b Product: gcc

[Bug tree-optimization/114151] [14 Regression] weird and inefficient codegen and addressing modes since g:a0b1798042d033fd2cc2c806afbb77875dd2909b

2024-02-28 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114151 --- Comment #3 from Tamar Christina --- > > This was a correctness fix btw, so I'm not sure we can easily recover - we > could try using niter information for CHREC_VARIABLE but then there's > variable niter here so I don't see a chance. >

[Bug target/98877] [AArch64] Inefficient code generated for tbl NEON intrinsics

2024-02-28 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98877 --- Comment #11 from Tamar Christina --- (In reply to Andrew Pinski from comment #10) > (In reply to Tamar Christina from comment #9) > > While RA should be able to deal with this, > > shouldn't we also just lower TBLs in gimple? > > > > This

[Bug target/98877] [AArch64] Inefficient code generated for tbl NEON intrinsics

2024-02-28 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98877 --- Comment #12 from Tamar Christina --- And it's not the first time we have conditional lowering. We already do so for e.g. shifts, where shifting by an amount >= bitsize of a vector element is defined behavior on AArch64.

[Bug tree-optimization/114234] [14 Regression] verify_ssa failure with early-break vectorisation

2024-03-05 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114234 Tamar Christina changed: What|Removed |Added Last reconfirmed||2024-03-05

[Bug tree-optimization/114068] [14 regression] ICE when building darktable-4.6.1 (error: PHI node with wrong VUSE on edge from BB 25) since r14-8768

2024-02-23 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114068 Tamar Christina changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |tnfchris at gcc dot gnu.org

[Bug tree-optimization/114068] [14 regression] ICE when building darktable-4.6.1 (error: PHI node with wrong VUSE on edge from BB 25) since r14-8768

2024-02-23 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114068 --- Comment #13 from Tamar Christina --- Created attachment 57510 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57510&action=edit candidate-patch1.patch candidate patch being tested. I was hoping to correct it during peeling itself when the

[Bug tree-optimization/115120] Bad interaction between ivcanon and early break vectorization

2024-05-17 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115120 --- Comment #3 from Tamar Christina --- That makes sense, though I also wonder how it works for scalar multi exit loops, IVops has various checks on single exits. I guess one problem is that the code in IVops that does this uses the exit to

[Bug tree-optimization/115130] New: (early-break) [meta-bug] early break vectorization

2024-05-17 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115130 Bug ID: 115130 Summary: (early-break) [meta-bug] early break vectorization Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: meta-bug, missed-optimization

[Bug tree-optimization/115130] (early-break) [meta-bug] early break vectorization

2024-05-17 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115130 Tamar Christina changed: What|Removed |Added Ever confirmed|0 |1 Last reconfirmed|

[Bug tree-optimization/114635] OpenMP reductions fail dependency analysis

2024-04-08 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635 --- Comment #6 from Tamar Christina --- (In reply to Jakub Jelinek from comment #4) > Now, with SVE/RISCV vectors the actual vectorization factor is a poly_int > rather than constant. One possibility would be to use VLA arrays in those >

[Bug tree-optimization/114635] New: OpenMP reductions fail dependency analysis

2024-04-08 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635 Bug ID: 114635 Summary: OpenMP reductions fail dependency analysis Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: missed-optimization Severity: normal

[Bug target/114860] [14/15 regression] [aarch64] 511.povray regresses by ~5.5% with -O3 -flto -march=native -mcpu=neoverse-v2 since r14-10014-ga2f4be3dae04fa

2024-05-22 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114860 --- Comment #9 from Tamar Christina --- (In reply to prathamesh3492 from comment #8) > Hi Tamar, > Using -falign-loops=5 indeed brings back the performance. > The adrp instruction has same address (0x4ae784) by setting -falign-loops=5 > (which

[Bug tree-optimization/114932] IVopts inefficient handling of signed IV used for addressing.

2024-06-05 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932 --- Comment #11 from Tamar Christina --- (In reply to Richard Biener from comment #10) > I think the question is why IVOPTs ends up using both the signed and > unsigned variant of the same IV instead of expressing all uses of both with > one

[Bug tree-optimization/54013] Loop with control flow not vectorized

2024-06-05 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54013 Tamar Christina changed: What|Removed |Added Blocks||115130 --- Comment #4 from Tamar

[Bug tree-optimization/114932] IVopts inefficient handling of signed IV used for addressing.

2024-06-06 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932 --- Comment #13 from Tamar Christina --- (In reply to rguent...@suse.de from comment #12) > > since we don't care about overflow here, it looks like the stripping should > > be recursive as long as it's a NOP expression between two integral

[Bug tree-optimization/114932] IVopts inefficient handling of signed IV used for addressing.

2024-06-06 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932 --- Comment #15 from Tamar Christina --- (In reply to rguent...@suse.de from comment #14) > On Thu, 6 Jun 2024, tnfchris at gcc dot gnu.org wrote: > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932 > > > > --- Comment #13 from Tamar

[Bug target/115464] [14 Backport] ICE when building libaom on arm64 (neon sve bridge usage with tbl/perm)

2024-06-13 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115464 --- Comment #10 from Tamar Christina --- Thanks for the fix, but I don't think it's sufficient. What I meant with the earlier comment was that the subregs are broken in general, so not just the one generated by the undef fast path. i.e.

[Bug tree-optimization/115531] vectorizer generates inefficient code for masked conditional update loops

2024-06-17 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115531 --- Comment #3 from Tamar Christina --- (In reply to Andrew Pinski from comment #1) > I suspect PR 20999 would fix this ... > but we have to be careful since without masked stores, you could still > vectorize this unlike the transformed

[Bug tree-optimization/115531] New: vectorizer generates inefficient code for masked conditional update loops

2024-06-17 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115531 Bug ID: 115531 Summary: vectorizer generates inefficient code for masked conditional update loops Product: gcc Version: 15.0 Status: UNCONFIRMED Keywords:

[Bug tree-optimization/115534] New: intermediate stack use not eliminated

2024-06-18 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115534 Bug ID: 115534 Summary: intermediate stack use not eliminated Product: gcc Version: 15.0 Status: UNCONFIRMED Keywords: missed-optimization Severity: normal

[Bug middle-end/115534] intermediate stack use not eliminated

2024-06-18 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115534 --- Comment #2 from Tamar Christina --- (In reply to Andrew Pinski from comment #1) > I suspect there is a dup of this already. See the bug which I made this one > blocking for a list of related bugs. Most of the other bugs relate to the

[Bug tree-optimization/115537] New: [15 Regression] vectorizable_reduction ICEs after g:d66b820f392aa9a7c34d3cddaf3d7c73bf23f82d

2024-06-18 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115537 Bug ID: 115537 Summary: [15 Regression] vectorizable_reduction ICEs after g:d66b820f392aa9a7c34d3cddaf3d7c73bf23f82d Product: gcc Version: 15.0 Status: UNCONFIRMED

[Bug tree-optimization/115537] [15 Regression] vectorizable_reduction ICEs after g:d66b820f392aa9a7c34d3cddaf3d7c73bf23f82d

2024-06-18 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115537 --- Comment #5 from Tamar Christina --- Thanks for the fix! I think the testcase needs SVE enabled to ICE, no? Shouldn't that be -mcpu=neoverse-v1 and not -mcpu=neoverse-n1?

[Bug middle-end/115534] intermediate stack use not eliminated

2024-06-18 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115534 --- Comment #5 from Tamar Christina --- (In reply to Andrew Pinski from comment #4) > This might be improved by > https://gcc.gnu.org/pipermail/gcc-patches/2024-June/654819.html . Or it > might be the case the vectorizer case needs to be

[Bug tree-optimization/114932] IVopts inefficient handling of signed IV used for addressing.

2024-06-05 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932 --- Comment #9 from Tamar Christina --- It's taken me a bit of time to track down all the reasons for the speedup with the earlier patch. This comes from two parts: 1. Signed IVs don't get simplified. Due to possible UB with signed overflows

[Bug target/115464] ICE when building libaom on arm64 (neon sve bridge usage with tbl/perm)

2024-06-12 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115464 Tamar Christina changed: What|Removed |Added CC||rsandifo at gcc dot gnu.org ---

[Bug target/115464] ICE when building libaom on arm64 (neon sve bridge usage with tbl/perm)

2024-06-12 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115464 Tamar Christina changed: What|Removed |Added Last reconfirmed||2024-06-12 CC|

[Bug target/115464] ICE when building libaom on arm64 (neon sve bridge usage with tbl/perm)

2024-06-12 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115464 --- Comment #7 from Tamar Christina --- (In reply to Tamar Christina from comment #6) > (In reply to Richard Sandiford from comment #5) > > In this kind of situation, we should go through a fresh pseudo rather than > > try to take the subreg

[Bug target/115464] ICE when building libaom on arm64 (neon sve bridge usage with tbl/perm)

2024-06-12 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115464 --- Comment #6 from Tamar Christina --- (In reply to Richard Sandiford from comment #5) > In this kind of situation, we should go through a fresh pseudo rather than > try to take the subreg directly. I did try that but fwprop pushed it back

[Bug tree-optimization/115629] Inefficient if-convert of masked conditionals

2024-07-01 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115629 --- Comment #4 from Tamar Christina --- (In reply to Richard Biener from comment #3) > So we now tail-merge the two b[i] loading blocks. Can you check SVE > code-gen with this? If that fixes the PR consider adding a SVE testcase. Thanks, the

[Bug libstdc++/88545] std::find compile to memchr in trivial random access cases (patch)

2024-07-01 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88545 --- Comment #12 from Tamar Christina --- I had a bug in the benchmark: I forgot to set taskset. These are the correct ones: | NEEDLE | scalar 1x | vect | memchr |

[Bug libstdc++/88545] std::find compile to memchr in trivial random access cases (patch)

2024-07-01 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88545 --- Comment #11 from Tamar Christina --- (In reply to Jonathan Wakely from comment #9) > Patch posted: https://gcc.gnu.org/pipermail/gcc-patches/2024-June/653731.html > > Rerunning benchmarks with this patch would be very welcome. OK, I have

[Bug tree-optimization/115629] Inefficient if-convert of masked conditionals

2024-07-01 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115629 --- Comment #6 from Tamar Christina --- (In reply to rguent...@suse.de from comment #5) > > In this case, the second load is conditional on the first load mask, which > > means it's already done an AND. > > And crucially inverting it means you

[Bug middle-end/115597] New: [15 Regression] vectorizer takes 20+ h compiling 510.parest in SPECCPU2017 since g:46bb4ce4d30ab749d40f6f4cef6f1fb7c7813452

2024-06-23 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115597 Bug ID: 115597 Summary: [15 Regression] vectorizer takes 20+ h compiling 510.parest in SPECCPU2017 since g:46bb4ce4d30ab749d40f6f4cef6f1fb7c7813452 Product: gcc

[Bug middle-end/115597] [15 Regression] vectorizer takes 20+ h compiling 510.parest in SPECCPU2017 since g:46bb4ce4d30ab749d40f6f4cef6f1fb7c7813452

2024-06-23 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115597 --- Comment #4 from Tamar Christina --- (In reply to Richard Biener from comment #2) > Ah, I feared this would happen - this case seems to be because of a lot of > VEC_PERM nodes(?) which are not handled by the CSE process as well as the >

[Bug middle-end/115597] [15 Regression] vectorizer takes 20+ h compiling 510.parest in SPECCPU2017 since g:46bb4ce4d30ab749d40f6f4cef6f1fb7c7813452

2024-06-23 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115597 --- Comment #3 from Tamar Christina --- > > Can you check whether that fixes the issue? > > diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc > index 9465d94de1a..212d5f97f7d 100644 > --- a/gcc/tree-vect-slp.cc > +++

[Bug tree-optimization/115120] Bad interaction between ivcanon and early break vectorization

2024-06-24 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115120 --- Comment #4 from Tamar Christina --- You asked why this doesn't happen with a normal vector loop, Richi. For a normal loop, when ivcanon adds the downward counting loop there are two main differences. 1. For a single exit loop, the downward

[Bug fortran/90608] Inline non-scalar minloc/maxloc calls

2024-07-05 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90608 --- Comment #19 from Tamar Christina --- Hi Mikael, It looks like the last version of your patch already gets inlined in the call sites we cared about. Would it be possible for you to upstream it?

[Bug tree-optimization/104265] Missed vectorization in 526.blender_r

2024-07-05 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104265 --- Comment #4 from Tamar Christina --- (In reply to Richard Biener from comment #3) > Note the SLP discovery opportunity is from the "reduction" PHI to the > return which merges control flow to a zero/one flag. Right, so I get what you mean

[Bug tree-optimization/104265] Missed vectorization in 526.blender_r

2024-07-05 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104265 --- Comment #5 from Tamar Christina --- Also for fully masked architectures we can instead of recreating the vectors just mask out the irrelevant values. But we should still order the exits based on complexity.

[Bug libstdc++/115799] ranges::find's optimized branching for memchr is not quite right

2024-07-06 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115799 Tamar Christina changed: What|Removed |Added CC||tnfchris at gcc dot gnu.org ---

[Bug tree-optimization/115866] New: missed optimization vectorizing switch statements.

2024-07-11 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115866 Bug ID: 115866 Summary: missed optimization vectorizing switch statements. Product: gcc Version: 15.0 Status: UNCONFIRMED Keywords: missed-optimization Severity: normal

[Bug tree-optimization/115866] missed optimization vectorizing switch statements.

2024-07-11 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115866 Tamar Christina changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |tnfchris at gcc dot gnu.org

[Bug tree-optimization/115629] New: Inefficient if-convert of masked conditionals

2024-06-24 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115629 Bug ID: 115629 Summary: Inefficient if-convert of masked conditionals Product: gcc Version: 15.0 Status: UNCONFIRMED Keywords: missed-optimization Severity: normal

[Bug c++/115623] ICE: Segmentation fault in finish_for_cond with novector and almost infinite loop

2024-06-24 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115623 --- Comment #4 from Tamar Christina --- novect3.c: In function 'void f(char*, int)': novect3.c:4:9: error: missing loop condition in loop with 'GCC novector' pragma before ';' token 4 | for (;;i++) | should do it, will

[Bug c++/115623] ICE: Segmentation fault in finish_for_cond with novector and almost infinite loop

2024-06-24 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115623 Tamar Christina changed: What|Removed |Added Status|NEW |ASSIGNED

[Bug tree-optimization/115531] vectorizer generates inefficient code for masked conditional update loops

2024-06-24 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115531 Tamar Christina changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |tnfchris at gcc dot gnu.org

[Bug tree-optimization/115120] Bad interaction between ivcanon and early break vectorization

2024-06-25 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115120 --- Comment #5 from Tamar Christina --- Considering ivopts bails out on doloop prediction for multiple exits anyway, what do you think about: diff --git a/gcc/tree-ssa-loop-ivcanon.cc b/gcc/tree-ssa-loop-ivcanon.cc index

[Bug c++/115623] ICE: Segmentation fault in finish_for_cond with novector and almost infinite loop

2024-07-04 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115623 Tamar Christina changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED

[Bug tree-optimization/115936] [15 Regression] GCN vs. ivopts: replace constant_multiple_of with aff_combination_constant_multiple_p [PR114932]

2024-07-17 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115936 Tamar Christina changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED

[Bug target/115934] [15 Regression] nvptx vs. ivopts: replace constant_multiple_of with aff_combination_constant_multiple_p [PR114932]

2024-07-15 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115934 --- Comment #4 from Tamar Christina --- This one looks a bit like costing, before the patch IVopts had: : inv_expr 1: -element_7(D) inv_expr 2: (signed int) rite_5(D) - (signed int) element_7(D) and after the patch it generates a few

[Bug target/115936] [15 Regression] GCN vs. ivopts: replace constant_multiple_of with aff_combination_constant_multiple_p [PR114932]

2024-07-15 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115936 --- Comment #4 from Tamar Christina --- (In reply to Richard Biener from comment #3) > iv->step should never be a pointer type That's what I initially thought too. My suspicion is that there is some code that tries to create the 0 offset.

[Bug target/115936] [15 Regression] GCN vs. ivopts: replace constant_multiple_of with aff_combination_constant_multiple_p [PR114932]

2024-07-15 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115936 Tamar Christina changed: What|Removed |Added Ever confirmed|0 |1 CC|

[Bug target/115936] [15 Regression] GCN vs. ivopts: replace constant_multiple_of with aff_combination_constant_multiple_p [PR114932]

2024-07-15 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115936 Tamar Christina changed: What|Removed |Added Target Milestone|--- |15.0 --- Comment #2 from Tamar

[Bug target/115934] [15 Regression] nvptx vs. ivopts: replace constant_multiple_of with aff_combination_constant_multiple_p [PR114932]

2024-07-15 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115934 Tamar Christina changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |tnfchris at gcc dot gnu.org

[Bug target/115934] [15 Regression] nvptx vs. ivopts: replace constant_multiple_of with aff_combination_constant_multiple_p [PR114932]

2024-07-15 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115934 --- Comment #1 from Tamar Christina --- Hi, thanks for the report, could you tell me a target triple I can use for nvptx?

[Bug target/115934] [15 Regression] nvptx vs. ivopts: replace constant_multiple_of with aff_combination_constant_multiple_p [PR114932]

2024-07-15 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115934 --- Comment #7 from Tamar Christina --- (In reply to Thomas Schwinge from comment #6) > Tamar, Richard, thanks for having a look. > > (In reply to Tamar Christina from comment #4) > > This one looks a bit like costing, [...] > > I see. So we

[Bug target/115936] [15 Regression] GCN vs. ivopts: replace constant_multiple_of with aff_combination_constant_multiple_p [PR114932]

2024-07-15 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115936 --- Comment #5 from Tamar Christina --- (In reply to Richard Biener from comment #3) > iv->step should never be a pointer type This is created by SCEV. simple_iv_with_niters in the case where no CHREC is found creates an IV with base == ev,

[Bug target/115936] [15 Regression] GCN vs. ivopts: replace constant_multiple_of with aff_combination_constant_multiple_p [PR114932]

2024-07-15 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115936 --- Comment #6 from Tamar Christina --- (In reply to Richard Biener from comment #3) > iv->step should never be a pointer type This is created by SCEV. simple_iv_with_niters in the case where no CHREC is found creates an IV with base == ev,

[Bug tree-optimization/53947] [meta-bug] vectorizer missed-optimizations

2024-07-22 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947 Bug 53947 depends on bug 115531, which changed state. Bug 115531 Summary: vectorizer generates inefficient code for masked conditional update loops https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115531 What|Removed

[Bug tree-optimization/115531] vectorizer generates inefficient code for masked conditional update loops

2024-07-22 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115531 Tamar Christina changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED

[Bug fortran/90608] Inline non-scalar minloc/maxloc calls

2024-07-23 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90608 --- Comment #20 from Tamar Christina --- Hi Mikael, I did regression testing on x86_64 and AArch64 and only found one test-ism. I think I understand most of the patch to be able to deal with any fallout, would it be ok if I fix the test-ism

[Bug ipa/106783] [12/13/14/15 Regression] ICE in ipa-modref.cc:analyze_function since r12-5247-ga34edf9a3e907de2

2024-07-23 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106783 --- Comment #8 from Tamar Christina --- (In reply to Jan Hubicka from comment #6) > The problem is that n/=0 is undefined behavior (so we can optimize out call > to function doing divide by zero), while __builtin_trap is observable and we > do

[Bug fortran/90608] Inline non-scalar minloc/maxloc calls

2024-07-24 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90608 --- Comment #22 from Tamar Christina --- (In reply to Mikael Morin from comment #21) > (In reply to Tamar Christina from comment #20) > > Hi Mikael, > > > > I did regression testing on x86_64 and AArch64 and only found one test-ism. > > > > I

[Bug tree-optimization/116074] [15 regression] ICE when building harfbuzz-9.0.0 on arm64 (related_int_vector_mode, at stor-layout.cc:581)

2024-07-25 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116074 --- Comment #7 from Tamar Christina --- The backend is returning TImode for get_vectype_for_scalar_type for historical reasons where large integer modes were considered struct types and thus vector modes. However they're not modes the

[Bug tree-optimization/116074] [15 regression] ICE when building harfbuzz-9.0.0 on arm64 (related_int_vector_mode, at stor-layout.cc:581)

2024-07-25 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116074

Tamar Christina changed:

What             |Removed |Added
Last reconfirmed |        |2024-07-25

[Bug tree-optimization/116074] [15 regression] ICE when building harfbuzz-9.0.0 on arm64 (related_int_vector_mode, at stor-layout.cc:581)

2024-07-25 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116074 --- Comment #8 from Tamar Christina --- Going with a backend fix instead.

[Bug target/116074] [15 regression] ICE when building harfbuzz-9.0.0 on arm64 (related_int_vector_mode, at stor-layout.cc:581)

2024-07-26 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116074 Tamar Christina changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|---

[Bug tree-optimization/53947] [meta-bug] vectorizer missed-optimizations

2024-07-26 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947 Bug 53947 depends on bug 116074, which changed state. Bug 116074 Summary: [15 regression] ICE when building harfbuzz-9.0.0 on arm64 (related_int_vector_mode, at stor-layout.cc:581) https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116074
