https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113682
--- Comment #9 from Tamar Christina ---
(In reply to Andrew Pinski from comment #8)
> This might be the path splitting running on the gimple level causing issues
> too; see PR 112402 .
Ah that's a good shout. It looks like Richi already
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114577
Bug ID: 114577
Summary: Inefficient codegen for SVE/NEON bridge
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114510
Tamar Christina changed:
What|Removed |Added
CC||tnfchris at gcc dot gnu.org
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114403
--- Comment #20 from Tamar Christina ---
This is a bad interaction with early break and peeling for gaps.
When peeling for gaps we set bias_for_lowest to 0, which then negates the ceil
for the upper bound calculation when the div is exact.
We
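To make the arithmetic in this comment concrete, here is a minimal C sketch; it is not GCC's actual code. It assumes the upper bound is obtained as a ceiling division of `niters + bias` by the vectorization factor, and the helper names `udiv_ceil` and `vector_upper_bound` as well as the sample values are hypothetical. With a bias of 0 and an exact division the ceiling has nothing to round up, so the result equals a plain floor division.

```c
#include <assert.h>

/* Ceiling division for unsigned values; d must be non-zero. */
static unsigned udiv_ceil(unsigned n, unsigned d)
{
    assert(d != 0);
    return n / d + (n % d != 0);
}

/* Hypothetical model of the bound: ceil((niters + bias) / vf).
   This only illustrates the arithmetic, not the vectorizer's code. */
static unsigned vector_upper_bound(unsigned niters, unsigned bias, unsigned vf)
{
    return udiv_ceil(niters + bias, vf);
}
```

With `niters = 16` and `vf = 4`, a bias of 0 yields 4 while a bias of 1 yields 5; the two only agree when the division is inexact (for example `niters = 17`).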
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114403
Tamar Christina changed:
What|Removed |Added
Status|UNCONFIRMED |ASSIGNED
Last reconfirmed|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113682
Tamar Christina changed:
What|Removed |Added
Status|UNCONFIRMED |NEW
Ever confirmed|0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441
Tamar Christina changed:
What|Removed |Added
Ever confirmed|0 |1
Summary|[14 Regression]
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114061
Bug ID: 114061
Summary: GCC fails vectorization when using __builtin_prefetch
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113257
--- Comment #5 from Tamar Christina ---
(In reply to Sam James from comment #3)
> (In reply to Richard Earnshaw from comment #2)
> I'm missing why the combination then works though?
So we've made several changes here over time.
-mcpu=native
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114061
--- Comment #2 from Tamar Christina ---
(In reply to Andrew Pinski from comment #1)
> I thought there was already one recorded about this.
I could only find https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103938 about an
ICE when prefetching a
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114061
--- Comment #4 from Tamar Christina ---
(In reply to Andrew Pinski from comment #3)
> Confirmed.
>
> Though maybe we should drop them in the vectorized version of the loop. HW
> prefetchers usually do a decent job and sometimes (maybe most) SW
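For readers unfamiliar with the pattern under discussion, a loop of roughly the following shape mixes a software prefetch with otherwise simple work. This is a hypothetical sketch, not the testcase from the report; the function name `sum_with_prefetch` and the prefetch distance of 16 elements are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Sum an array while issuing a software prefetch ahead of the current element. */
static long sum_with_prefetch(const int *a, size_t n)
{
    assert(a != NULL || n == 0);
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        /* Read prefetch (rw = 0) with high temporal locality (3).
           The guard keeps the address expression inside the array. */
        if (i + 16 < n)
            __builtin_prefetch(&a[i + 16], 0, 3);
        sum += a[i];
    }
    return sum;
}
```

The prefetch has no effect on the computed value, which is why dropping it in a vectorized copy of the loop is an option at all.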
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114063
Bug ID: 114063
Summary: Use IFN_CHECK_RAW_PTRS/IFN_CHECK_WAR_PTRS for
Advanced SIMD
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Keywords:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114068
--- Comment #12 from Tamar Christina ---
looks like the moving of the store didn't update a stray out of block use of
the MEM.
working on patch.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102171
--- Comment #3 from Tamar Christina ---
(In reply to Andrew Pinski from comment #2)
> I think I am going to implement this (or assign it internally to someone else
> to implement).
If you do, please also remove them from arm_neon.h and use the
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86530
--- Comment #8 from Tamar Christina ---
(In reply to Andrew Pinski from comment #6)
> With my patch for V4QI, we still don't get the best code:
> vect_perm_even_271 = VEC_PERM_EXPR 4, 6 }>;
> vect_perm_even_273 = VEC_PERM_EXPR 4, 6 }>;
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441
Tamar Christina changed:
What|Removed |Added
CC||rsandifo at gcc dot gnu.org
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114099
--- Comment #8 from Tamar Christina ---
Created attachment 57537
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57537&action=edit
uses.patch
new code seems sensitive to visitation order as get_virtual_phi returns NULL
for blocks which don't
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441
--- Comment #27 from Tamar Christina ---
Created attachment 57538
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57538&action=edit
proposed1.patch
proposed patch, this gets the gathers and scatters back. doing regression run.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114192
Tamar Christina changed:
What|Removed |Added
Ever confirmed|0 |1
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98877
--- Comment #9 from Tamar Christina ---
While RA should be able to deal with this,
shouldn't we also just lower TBLs in gimple?
There's no reason why this can't be a VEC_PERM_EXPR which would also get the
copies
removed at the gimple level and
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114151
Bug ID: 114151
Summary: [14 Regression] weird and inefficient codegen and
addressing modes since
g:a0b1798042d033fd2cc2c806afbb77875dd2909b
Product: gcc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114151
--- Comment #3 from Tamar Christina ---
>
> This was a correctness fix btw, so I'm not sure we can easily recover - we
> could try using niter information for CHREC_VARIABLE but then there's
> variable niter here so I don't see a chance.
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98877
--- Comment #11 from Tamar Christina ---
(In reply to Andrew Pinski from comment #10)
> (In reply to Tamar Christina from comment #9)
> > While RA should be able to deal with this,
> > shouldn't we also just lower TBLs in gimple?
> >
> > This
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98877
--- Comment #12 from Tamar Christina ---
and it's not the first time we have conditional lowering. We already do so for
e.g. shifts, where shifting by an amount >= bitsize of a vector element is
defined behavior on AArch64.
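As a rough illustration of what "defined" can mean for such shifts, the following C sketch models a lane-wise left shift that stays well defined for any shift amount. It is a scalar model only, not the backend's lowering: the names `lane_shl` and `vec_shl` are hypothetical, and the choice that out-of-range amounts produce 0 is an assumption of this sketch. In plain C, `x << amt` with `amt >= 32` on a 32-bit value would be undefined.

```c
#include <assert.h>
#include <stdint.h>

/* Model of a left shift on one 32-bit lane that is defined for any amount.
   ASSUMPTION: amounts >= 32 produce 0. */
static uint32_t lane_shl(uint32_t x, unsigned amt)
{
    return amt >= 32 ? 0u : x << amt;
}

/* Apply the same shift amount to each of four lanes in place. */
static void vec_shl(uint32_t lanes[4], unsigned amt)
{
    assert(lanes != 0);
    for (int i = 0; i < 4; i++)
        lanes[i] = lane_shl(lanes[i], amt);
}
```

A generic lowering can only replace such an operation with an ordinary shift when the amount is known to be in range, which is why the lowering has to be conditional.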
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114234
Tamar Christina changed:
What|Removed |Added
Last reconfirmed||2024-03-05
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114068
Tamar Christina changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |tnfchris at gcc dot
gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114068
--- Comment #13 from Tamar Christina ---
Created attachment 57510
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57510&action=edit
candidate-patch1.patch
candidate patch being tested.
I was hoping to correct it during peeling itself when the
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115120
--- Comment #3 from Tamar Christina ---
That makes sense, though I also wonder how it works for scalar multi exit
loops, IVops has various checks on single exits.
I guess one problem is that the code in IVops that does this uses the exit to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115130
Bug ID: 115130
Summary: (early-break) [meta-bug] early break vectorization
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Keywords: meta-bug, missed-optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115130
Tamar Christina changed:
What|Removed |Added
Ever confirmed|0 |1
Last reconfirmed|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635
--- Comment #6 from Tamar Christina ---
(In reply to Jakub Jelinek from comment #4)
> Now, with SVE/RISCV vectors the actual vectorization factor is a poly_int
> rather than constant. One possibility would be to use VLA arrays in those
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635
Bug ID: 114635
Summary: OpenMP reductions fail dependency analysis
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114860
--- Comment #9 from Tamar Christina ---
(In reply to prathamesh3492 from comment #8)
> Hi Tamar,
> Using -falign-loops=5 indeed brings back the performance.
> The adrp instruction has same address (0x4ae784) by setting -falign-loops=5
> (which
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932
--- Comment #11 from Tamar Christina ---
(In reply to Richard Biener from comment #10)
> I think the question is why IVOPTs ends up using both the signed and
> unsigned variant of the same IV instead of expressing all uses of both with
> one
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54013
Tamar Christina changed:
What|Removed |Added
Blocks||115130
--- Comment #4 from Tamar
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932
--- Comment #13 from Tamar Christina ---
(In reply to rguent...@suse.de from comment #12)
> > since we don't care about overflow here, it looks like the stripping should
> > be recursive as long as it's a NOP expression between two integral
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932
--- Comment #15 from Tamar Christina ---
(In reply to rguent...@suse.de from comment #14)
> On Thu, 6 Jun 2024, tnfchris at gcc dot gnu.org wrote:
>
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932
> >
> > --- Comment #13 from Tamar
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115464
--- Comment #10 from Tamar Christina ---
Thanks for the fix, but I don't think it's sufficient.
What I meant with the earlier comment was that the subregs are broken in
general, so not just the one generated by the undef fast path.
i.e.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115531
--- Comment #3 from Tamar Christina ---
(In reply to Andrew Pinski from comment #1)
> I suspect PR 20999 would fix this ...
> but we have to be careful since without masked stores, you could still
> vectorize this unlike the transformed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115531
Bug ID: 115531
Summary: vectorizer generates inefficient code for masked
conditional update loops
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Keywords:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115534
Bug ID: 115534
Summary: intermediate stack use not eliminated
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115534
--- Comment #2 from Tamar Christina ---
(In reply to Andrew Pinski from comment #1)
> I suspect there is a dup of this already. See the bug which I made this one
> blocking for a list of related bugs.
Most of the other bugs relate to the
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115537
Bug ID: 115537
Summary: [15 Regression] vectorizable_reduction ICEs after
g:d66b820f392aa9a7c34d3cddaf3d7c73bf23f82d
Product: gcc
Version: 15.0
Status: UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115537
--- Comment #5 from Tamar Christina ---
Thanks for the fix!
I think the testcase needs SVE enabled to ICE, no?
Shouldn't that be -mcpu=neoverse-v1 and not -mcpu=neoverse-n1?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115534
--- Comment #5 from Tamar Christina ---
(In reply to Andrew Pinski from comment #4)
> This might be improved by
> https://gcc.gnu.org/pipermail/gcc-patches/2024-June/654819.html . Or it
> might be the case the vectorizer case needs to be
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932
--- Comment #9 from Tamar Christina ---
It's taken me a bit of time to track down all the reasons for the speedup with
the earlier patch.
This comes from two parts:
1. Signed IVs don't get simplified. Due to possible UB with signed overflows
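The distinction between signed and unsigned induction variables can be pictured with two loops that differ only in the type of the counter. This is a hypothetical sketch for orientation, not code from the report; the function names are assumptions. Signed overflow is undefined in C, whereas unsigned arithmetic wraps, so a compiler may reason about the two counters differently even though both loops compute the same sum for valid inputs.

```c
#include <assert.h>

/* Induction variable declared as a signed int. */
static long sum_signed_iv(const int *a, int n)
{
    assert(n >= 0);
    long sum = 0;
    for (int i = 0; i < n; i++)
        sum += a[i];
    return sum;
}

/* Same loop with an unsigned induction variable, where wraparound is well defined. */
static long sum_unsigned_iv(const int *a, unsigned n)
{
    long sum = 0;
    for (unsigned i = 0; i < n; i++)
        sum += a[i];
    return sum;
}
```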
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115464
Tamar Christina changed:
What|Removed |Added
CC||rsandifo at gcc dot gnu.org
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115464
Tamar Christina changed:
What|Removed |Added
Last reconfirmed||2024-06-12
CC|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115464
--- Comment #7 from Tamar Christina ---
(In reply to Tamar Christina from comment #6)
> (In reply to Richard Sandiford from comment #5)
> > In this kind of situation, we should go through a fresh pseudo rather than
> > try to take the subreg
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115464
--- Comment #6 from Tamar Christina ---
(In reply to Richard Sandiford from comment #5)
> In this kind of situation, we should go through a fresh pseudo rather than
> try to take the subreg directly.
I did try that but fwprop pushed it back
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115629
--- Comment #4 from Tamar Christina ---
(In reply to Richard Biener from comment #3)
> So we now tail-merge the two b[i] loading blocks. Can you check SVE
> code-gen with this? If that fixes the PR consider adding a SVE testcase.
Thanks, the
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88545
--- Comment #12 from Tamar Christina ---
I had a bug in the benchmark: I forgot to set taskset.
These are the correct ones:
++---+-+-+
| NEEDLE | scalar 1x | vect| memchr |
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88545
--- Comment #11 from Tamar Christina ---
(In reply to Jonathan Wakely from comment #9)
> Patch posted: https://gcc.gnu.org/pipermail/gcc-patches/2024-June/653731.html
>
> Rerunning benchmarks with this patch would be very welcome.
OK, I have
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115629
--- Comment #6 from Tamar Christina ---
(In reply to rguent...@suse.de from comment #5)
> > In this case, the second load is conditional on the first load mask, which
> > means it's already done an AND.
> > And crucially inverting it means you
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115597
Bug ID: 115597
Summary: [15 Regression] vectorizer takes 20+ h compiling
510.parest in SPECCPU2017 since
g:46bb4ce4d30ab749d40f6f4cef6f1fb7c7813452
Product: gcc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115597
--- Comment #4 from Tamar Christina ---
(In reply to Richard Biener from comment #2)
> Ah, I feared this would happen - this case seems to be because of a lot of
> VEC_PERM nodes(?) which are not handled by the CSE process as well as the
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115597
--- Comment #3 from Tamar Christina ---
>
> Can you check whether that fixes the issue?
>
> diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
> index 9465d94de1a..212d5f97f7d 100644
> --- a/gcc/tree-vect-slp.cc
> +++
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115120
--- Comment #4 from Tamar Christina ---
You asked why this doesn't happen with a normal vector loop, Richi.
For a normal loop when IVcanon adds the downward counting loop there are two
main differences.
1. for a single exit loop, the downward
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90608
--- Comment #19 from Tamar Christina ---
Hi Mikael,
It looks like the last version of your patch already gets inlined in the call
sites we cared about.
Would it be possible for you to upstream it?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104265
--- Comment #4 from Tamar Christina ---
(In reply to Richard Biener from comment #3)
> Note the SLP discovery opportunity is from the "reduction" PHI to the
> return which merges control flow to a zero/one flag.
Right, so I get what you mean
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104265
--- Comment #5 from Tamar Christina ---
Also, for fully masked architectures we can, instead of recreating the vectors,
just mask out the irrelevant values.
But we should still order the exits based on complexity.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115799
Tamar Christina changed:
What|Removed |Added
CC||tnfchris at gcc dot gnu.org
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115866
Bug ID: 115866
Summary: missed optimization vectorizing switch statements.
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115866
Tamar Christina changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |tnfchris at gcc dot
gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115629
Bug ID: 115629
Summary: Inefficient if-convert of masked conditionals
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115623
--- Comment #4 from Tamar Christina ---
novect3.c: In function 'void f(char*, int)':
novect3.c:4:9: error: missing loop condition in loop with 'GCC novector' pragma
before ';' token
4 | for (;;i++)
|
should do it, will
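For contrast with the rejected `for (;;i++)` form in the diagnostic above, a loop that carries the pragma together with an explicit condition looks like the following. This is a hypothetical sketch; the function name `fill` and its parameters are assumptions, and it presumes a GCC release that recognises `#pragma GCC novector` (a compiler that does not know the pragma would typically ignore it, possibly with a warning).

```c
#include <assert.h>

/* Fill the first n bytes of p with c, asking GCC not to vectorize the loop. */
static void fill(char *p, int n, char c)
{
    assert(n >= 0);
#pragma GCC novector
    for (int i = 0; i < n; i++)
        p[i] = c;
}
```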
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115623
Tamar Christina changed:
What|Removed |Added
Status|NEW |ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115531
Tamar Christina changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |tnfchris at gcc dot
gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115120
--- Comment #5 from Tamar Christina ---
considering ivopts bails out on doloop prediction for multiple exits anyway,
what do you think about:
diff --git a/gcc/tree-ssa-loop-ivcanon.cc b/gcc/tree-ssa-loop-ivcanon.cc
index
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115623
Tamar Christina changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115936
Tamar Christina changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115934
--- Comment #4 from Tamar Christina ---
This one looks a bit like costing,
before the patch IVopts had:
:
inv_expr 1: -element_7(D)
inv_expr 2: (signed int) rite_5(D) - (signed int) element_7(D)
and after the patch it generates a few
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115936
--- Comment #4 from Tamar Christina ---
(In reply to Richard Biener from comment #3)
> iv->step should never be a pointer type
That's what I initially thought too. My suspicion is that there is some code
that tries to create the 0 offset.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115936
Tamar Christina changed:
What|Removed |Added
Ever confirmed|0 |1
CC|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115936
Tamar Christina changed:
What|Removed |Added
Target Milestone|--- |15.0
--- Comment #2 from Tamar
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115934
Tamar Christina changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |tnfchris at gcc dot
gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115934
--- Comment #1 from Tamar Christina ---
Hi, thanks for the report, could you tell me a target triple I can use for
nvptx?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115934
--- Comment #7 from Tamar Christina ---
(In reply to Thomas Schwinge from comment #6)
> Tamar, Richard, thanks for having a look.
>
> (In reply to Tamar Christina from comment #4)
> > This one looks a bit like costing, [...]
>
> I see. So we
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115936
--- Comment #5 from Tamar Christina ---
(In reply to Richard Biener from comment #3)
> iv->step should never be a pointer type
This is created by SCEV.
simple_iv_with_niters in the case where no CHREC is found creates an IV with
base == ev,
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115936
--- Comment #6 from Tamar Christina ---
(In reply to Richard Biener from comment #3)
> iv->step should never be a pointer type
This is created by SCEV.
simple_iv_with_niters in the case where no CHREC is found creates an IV with
base == ev,
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
Bug 53947 depends on bug 115531, which changed state.
Bug 115531 Summary: vectorizer generates inefficient code for masked
conditional update loops
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115531
What|Removed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115531
Tamar Christina changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90608
--- Comment #20 from Tamar Christina ---
Hi Mikael,
I did regression testing on x86_64 and AArch64 and only found one test-ism.
I think I understand most of the patch to be able to deal with any fallout,
would it be ok if I fix the test-ism
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106783
--- Comment #8 from Tamar Christina ---
(In reply to Jan Hubicka from comment #6)
> The problem is that n/=0 is undefined behavior (so we can optimize out call
> to function doing divide by zero), while __builtin_trap is observable and we
> do
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90608
--- Comment #22 from Tamar Christina ---
(In reply to Mikael Morin from comment #21)
> (In reply to Tamar Christina from comment #20)
> > Hi Mikael,
> >
> > I did regression testing on x86_64 and AArch64 and only found one test-ism.
> >
> > I
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116074
--- Comment #7 from Tamar Christina ---
The backend is returning TImode for get_vectype_for_scalar_type for historical
reasons where large integer modes were considered struct types and thus vector
modes.
However they're not modes the
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116074
Tamar Christina changed:
What|Removed |Added
Last reconfirmed||2024-07-25
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116074
--- Comment #8 from Tamar Christina ---
Going with a backend fix instead.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116074
Tamar Christina changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
Bug 53947 depends on bug 116074, which changed state.
Bug 116074 Summary: [15 regression] ICE when building harfbuzz-9.0.0 on arm64
(related_int_vector_mode, at stor-layout.cc:581)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116074