https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115799
Tamar Christina changed:
What|Removed |Added
CC||tnfchris at gcc dot gnu.org
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104265
--- Comment #5 from Tamar Christina ---
Also, for fully masked architectures, instead of recreating the vectors we can
just mask out the irrelevant values.
But we should still order the exits based on complexity.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104265
--- Comment #4 from Tamar Christina ---
(In reply to Richard Biener from comment #3)
> Note the SLP discovery opportunity is from the "reduction" PHI to the
> return which merges control flow to a zero/one flag.
Right, so I get what you mean
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90608
--- Comment #19 from Tamar Christina ---
Hi Mikael,
It looks like the last version of your patch already gets inlined in the call
sites we cared about.
Would it be possible for you to upstream it?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115623
Tamar Christina changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115629
--- Comment #6 from Tamar Christina ---
(In reply to rguent...@suse.de from comment #5)
> > In this case, the second load is conditional on the first load mask, which
> > means it's already done an AND.
> > And crucially inverting it means you
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88545
--- Comment #12 from Tamar Christina ---
I had a bug in the benchmark: I forgot to set taskset.
These are the correct ones:
++---+-+-+
| NEEDLE | scalar 1x | vect| memchr |
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115629
--- Comment #4 from Tamar Christina ---
(In reply to Richard Biener from comment #3)
> So we now tail-merge the two b[i] loading blocks. Can you check SVE
> code-gen with this? If that fixes the PR consider adding a SVE testcase.
Thanks, the
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88545
--- Comment #11 from Tamar Christina ---
(In reply to Jonathan Wakely from comment #9)
> Patch posted: https://gcc.gnu.org/pipermail/gcc-patches/2024-June/653731.html
>
> Rerunning benchmarks with this patch would be very welcome.
OK, I have
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115120
--- Comment #5 from Tamar Christina ---
Considering ivopts bails out on doloop prediction for multiple exits anyway,
what do you think about:
diff --git a/gcc/tree-ssa-loop-ivcanon.cc b/gcc/tree-ssa-loop-ivcanon.cc
index
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115629
Bug ID: 115629
Summary: Inefficient if-convert of masked conditionals
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115531
Tamar Christina changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |tnfchris at gcc dot
gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115623
--- Comment #4 from Tamar Christina ---
novect3.c: In function 'void f(char*, int)':
novect3.c:4:9: error: missing loop condition in loop with 'GCC novector' pragma
before ';' token
4 | for (;;i++)
|
should do it, will
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115623
Tamar Christina changed:
What|Removed |Added
Status|NEW |ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115120
--- Comment #4 from Tamar Christina ---
You asked why this doesn't happen with a normal vector loop Richi.
For a normal loop, when IVcanon adds the downward-counting loop, there are two
main differences.
1. for a single exit loop, the downward
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115597
--- Comment #4 from Tamar Christina ---
(In reply to Richard Biener from comment #2)
> Ah, I feared this would happen - this case seems to be because of a lot of
> VEC_PERM nodes(?) which are not handled by the CSE process as well as the
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115597
--- Comment #3 from Tamar Christina ---
>
> Can you check whether that fixes the issue?
>
> diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
> index 9465d94de1a..212d5f97f7d 100644
> --- a/gcc/tree-vect-slp.cc
> +++
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115597
Bug ID: 115597
Summary: [15 Regression] vectorizer takes 20+ h compiling
510.parest in SPECCPU2017 since
g:46bb4ce4d30ab749d40f6f4cef6f1fb7c7813452
Product: gcc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115534
--- Comment #5 from Tamar Christina ---
(In reply to Andrew Pinski from comment #4)
> This might be improved by
> https://gcc.gnu.org/pipermail/gcc-patches/2024-June/654819.html . Or it
> might be the case the vectorizer case needs to be
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115537
--- Comment #5 from Tamar Christina ---
Thanks for the fix!
I think the testcase needs SVE enabled to ICE, no?
Shouldn't that be -mcpu=neoverse-v1 and not -mcpu=neoverse-n1?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115534
--- Comment #2 from Tamar Christina ---
(In reply to Andrew Pinski from comment #1)
> I suspect there is a dup of this already. See the bug which I made this one
> blocking for a list of related bugs.
Most of the other bugs relate to the
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115537
Bug ID: 115537
Summary: [15 Regression] vectorizable_reduction ICEs after
g:d66b820f392aa9a7c34d3cddaf3d7c73bf23f82d
Product: gcc
Version: 15.0
Status: UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115534
Bug ID: 115534
Summary: intermediate stack use not eliminated
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115531
--- Comment #3 from Tamar Christina ---
(In reply to Andrew Pinski from comment #1)
> I suspect PR 20999 would fix this ...
> but we have to be careful since without masked stores, you could still
> vectorize this unlike the transformed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115531
Bug ID: 115531
Summary: vectorizer generates inefficient code for masked
conditional update loops
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Keywords:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115464
--- Comment #10 from Tamar Christina ---
Thanks for the fix, but I don't think it's sufficient.
What I meant with the earlier comment was that the subregs are broken in
general, not just the one generated by the undef fast path.
i.e.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115464
--- Comment #7 from Tamar Christina ---
(In reply to Tamar Christina from comment #6)
> (In reply to Richard Sandiford from comment #5)
> > In this kind of situation, we should go through a fresh pseudo rather than
> > try to take the subreg
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115464
--- Comment #6 from Tamar Christina ---
(In reply to Richard Sandiford from comment #5)
> In this kind of situation, we should go through a fresh pseudo rather than
> try to take the subreg directly.
I did try that but fwprop pushed it back
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115464
Tamar Christina changed:
What|Removed |Added
CC||rsandifo at gcc dot gnu.org
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115464
Tamar Christina changed:
What|Removed |Added
Last reconfirmed||2024-06-12
CC|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932
--- Comment #15 from Tamar Christina ---
(In reply to rguent...@suse.de from comment #14)
> On Thu, 6 Jun 2024, tnfchris at gcc dot gnu.org wrote:
>
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932
> >
> > --- Comment #13 from Tamar
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932
--- Comment #13 from Tamar Christina ---
(In reply to rguent...@suse.de from comment #12)
> > since we don't care about overflow here, it looks like the stripping should
> > be recursive as long as it's a NOP expression between two integral
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932
--- Comment #11 from Tamar Christina ---
(In reply to Richard Biener from comment #10)
> I think the question is why IVOPTs ends up using both the signed and
> unsigned variant of the same IV instead of expressing all uses of both with
> one
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54013
Tamar Christina changed:
What|Removed |Added
Blocks||115130
--- Comment #4 from Tamar
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932
--- Comment #9 from Tamar Christina ---
It's taken me a bit of time to track down all the reasons for the speedup with
the earlier patch.
This comes from two parts:
1. Signed IVs don't get simplified. Due to possible UB with signed overflows
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114860
--- Comment #9 from Tamar Christina ---
(In reply to prathamesh3492 from comment #8)
> Hi Tamar,
> Using -falign-loops=5 indeed brings back the performance.
> The adrp instruction has same address (0x4ae784) by setting -falign-loops=5
> (which
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115130
Tamar Christina changed:
What|Removed |Added
Ever confirmed|0 |1
Last reconfirmed|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115130
Bug ID: 115130
Summary: (early-break) [meta-bug] early break vectorization
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Keywords: meta-bug, missed-optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115120
--- Comment #3 from Tamar Christina ---
That makes sense, though I also wonder how it works for scalar multi-exit
loops; IVopts has various checks on single exits.
I guess one problem is that the code in IVopts that does this uses the exit to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114860
--- Comment #7 from Tamar Christina ---
Yeah, it's most likely an alignment issue, especially as there are no code
changes.
We run our benchmarking with different flags, so that may be why we don't see it.
The loop seems misaligned; you can try
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114412
--- Comment #5 from Tamar Christina ---
(In reply to Filip Kastl from comment #4)
> (In reply to Tamar Christina from comment #3)
> > Hi Filip,
> >
> > Do you generate these runs with counters based PGO or compiler
> > instrumentation?
> >
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115087
Bug ID: 115087
Summary: dead block not eliminated in SVE intrinsics code
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114412
Tamar Christina changed:
What|Removed |Added
CC||tnfchris at gcc dot gnu.org
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932
Tamar Christina changed:
What|Removed |Added
Ever confirmed|0 |1
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932
--- Comment #6 from Tamar Christina ---
Created attachment 58096
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58096&action=edit
exchange2.fppized-bad.f90.187t.ivopts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932
--- Comment #5 from Tamar Christina ---
Created attachment 58095
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58095&action=edit
exchange2.fppized-good.f90.187t.ivopts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932
--- Comment #4 from Tamar Christina ---
reduced more:
---
module brute_force
integer, parameter :: r=9
integer block(r, r, 0)
contains
subroutine brute
do
do
do
do
do
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932
--- Comment #3 from Tamar Christina ---
(In reply to Andrew Pinski from comment #2)
> > which is harder for prefetchers to follow.
>
> This seems like a limitation in the HW prefetcher rather than anything else.
> Maybe the cost model for
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932
Bug ID: 114932
Summary: Improvement in CHREC can give large performance gains
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92538
Tamar Christina changed:
What|Removed |Added
CC||jamborm at gcc dot gnu.org
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114860
--- Comment #3 from Tamar Christina ---
I cannot reproduce this even recompiling libc.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114860
Tamar Christina changed:
What|Removed |Added
CC||tnfchris at gcc dot gnu.org
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114860
--- Comment #1 from Tamar Christina ---
Hmm
I am unable to reproduce this with -O3 -flto -mcpu=neoverse-v2 on a
neoverse-v2 machine.
Is any other option required?
Also that code was new in gcc 14 and was partially reverted due to register
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114766
--- Comment #2 from Tamar Christina ---
(In reply to Vladimir Makarov from comment #1)
> (In reply to Tamar Christina from comment #0)
> > The documentation for ^ states:
>
> If it works for you, we could try to use the patch (although it needs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114769
Tamar Christina changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114769
--- Comment #2 from Tamar Christina ---
I believe this is safe, but the interface is definitely not the cleanest.
vect_recog_absolute_difference has two callers:
1. vect_recog_sad_pattern where if you return true with unprom not set, then
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113625
Tamar Christina changed:
What|Removed |Added
CC||tnfchris at gcc dot gnu.org
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114766
Bug ID: 114766
Summary: ^ constraint modifier unexpectedly affects register
class selection.
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Keywords:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114741
Tamar Christina changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114513
Bug 114513 depends on bug 114741, which changed state.
Bug 114741 Summary: [14 regression] aarch64 sve: unnecessary fmov for scalar
int bit operations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114741
What|Removed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114741
Tamar Christina changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |tnfchris at gcc dot
gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114741
--- Comment #6 from Tamar Christina ---
And the exact armv9-a cost model you quoted also does the right codegen.
https://godbolt.org/z/obafoT6cj
There is just an inexplicable penalty being applied to the r->r alternative.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114741
Tamar Christina changed:
What|Removed |Added
CC||tnfchris at gcc dot gnu.org,
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113552
Tamar Christina changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114403
Tamar Christina changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114403
--- Comment #26 from Tamar Christina ---
(In reply to Richard Biener from comment #25)
> That means, when the loop takes the early exit we _must_ take that during
> the vector iterations. Peeling for gaps means if we would take the early
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114403
--- Comment #24 from Tamar Christina ---
(In reply to Richard Biener from comment #23)
> Maybe easier to understand testcase:
>
> with -O3 -msse4.1 -fno-vect-cost-model we return 20 instead of 8. Adding
> -fdisable-tree-cunroll avoids the
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114403
--- Comment #22 from Tamar Christina ---
Note that, due to the secondary exit, the actual full vector iteration count is 8
scalar elements at VF=4 == 2.
And it's this boundary condition where we fail, since ceil (8/4) == 2. Any
other value would
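As a rough illustration of the boundary condition described in this comment (a sketch only; the helper names are hypothetical and not taken from GCC), the following Python shows that the ceiling and the plain quotient of n/VF coincide only when the scalar iteration count is an exact multiple of VF, as with 8 elements at VF=4:

```python
def ceil_div(n, vf):
    """Ceiling of n / vf for non-negative n and positive vf."""
    return (n + vf - 1) // vf


def floor_div(n, vf):
    """Number of full vector iterations covering n scalar elements."""
    return n // vf


def is_exact_boundary(n, vf):
    """True when rounding up changes nothing, i.e. vf divides n exactly."""
    return ceil_div(n, vf) == floor_div(n, vf)


if __name__ == "__main__":
    # Compare a few scalar iteration counts around the 8-element case.
    for n in (7, 8, 9):
        print(n, floor_div(n, 4), ceil_div(n, 4), is_exact_boundary(n, 4))
```

This only models the arithmetic; how the vectorizer's upper-bound calculation uses these values is not shown here.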
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114403
--- Comment #21 from Tamar Christina ---
Created attachment 57932
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57932&action=edit
loop.c
Attached a reduced testcase that reproduces the issue and also checks the buffer
position and copied values.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635
--- Comment #6 from Tamar Christina ---
(In reply to Jakub Jelinek from comment #4)
> Now, with SVE/RISCV vectors the actual vectorization factor is a poly_int
> rather than constant. One possibility would be to use VLA arrays in those
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635
Bug ID: 114635
Summary: OpenMP reductions fail dependency analysis
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114577
Bug ID: 114577
Summary: Inefficient codegen for SVE/NEON bridge
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114510
Tamar Christina changed:
What|Removed |Added
CC||tnfchris at gcc dot gnu.org
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114515
Tamar Christina changed:
What|Removed |Added
CC||tnfchris at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114575
Bug ID: 114575
Summary: [14 Regression] SVE addressing modes broken since
g:839bc42772ba7af66af3bd16efed4a69511312ae
Product: gcc
Version: 14.0
Status: UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113682
--- Comment #9 from Tamar Christina ---
(In reply to Andrew Pinski from comment #8)
> This might be the path splitting running on the gimple level causing issues
> too; see PR 112402 .
Ah that's a good shout. It looks like Richi already
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113682
Tamar Christina changed:
What|Removed |Added
Status|UNCONFIRMED |NEW
Ever confirmed|0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114403
--- Comment #20 from Tamar Christina ---
This is a bad interaction between early break and peeling for gaps.
When peeling for gaps, we set bias_for_lowest to 0, which then negates the ceil
for the upper bound calculation when the division is exact.
We
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114403
Tamar Christina changed:
What|Removed |Added
Status|UNCONFIRMED |ASSIGNED
Last reconfirmed|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114345
--- Comment #5 from Tamar Christina ---
(In reply to Richard Biener from comment #4)
> Well, the shuffling in .LOAD_LANES will be a bit awkward to do, but sure. We
> basically lack "constant folding" of .LOAD_LANES and similarly of course
> we
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114350
Bug ID: 114350
Summary: missing support for SVE widening floating point
conversion
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Keywords:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114345
--- Comment #3 from Tamar Christina ---
(In reply to Andrew Pinski from comment #2)
> Oh VN does have some knowledge of MASK_STORE and LEN_STORE. Just not
> LOAD_LANES .
>
>
> See PR 106365 for MASK_STORE and LEN_STORE implementation.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114346
Bug ID: 114346
Summary: vectorizer generates the same IV twice
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114345
Bug ID: 114345
Summary: FRE missing knowledge of semantics of IFN loads
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114339
--- Comment #6 from Tamar Christina ---
vectorizer generates:
mask_patt_21.19_58 = vect_perm_even_49 >= vect_cst__57;
mask_patt_21.19_59 = vect_perm_even_55 >= vect_cst__57;
vexit_reduc_63 = mask_patt_21.19_58 | mask_patt_21.19_59;
if
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114151
--- Comment #17 from Tamar Christina ---
> So doing in the vectorizer sth like the following should get us the best
> possible ranges? Ah, probably only global ranges since the SCEV query
> itself would still lack context sensitive info (but
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114234
Tamar Christina changed:
What|Removed |Added
Last reconfirmed||2024-03-05
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114192
Tamar Christina changed:
What|Removed |Added
Ever confirmed|0 |1
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98877
--- Comment #12 from Tamar Christina ---
And it's not the first time we have conditional lowering. We already do so for
e.g. shifts, where shifting by an amount >= bitsize of a vector element is
defined behavior on AArch64.
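To make the shift case concrete, here is a minimal Python model of a left shift on one unsigned vector lane. It is a sketch under stated assumptions: the function name is hypothetical, the amount is assumed non-negative, and the out-of-range result of 0 is chosen for illustration rather than taken from the bug report.

```python
def shift_left_lane(value, amount, bits):
    """Model a left shift of one unsigned lane that is `bits` wide.

    Assumption for illustration: an amount >= bits has a defined result
    (0 here) instead of being undefined as in a plain C scalar shift.
    """
    if amount >= bits:
        return 0
    # Keep only the low `bits` bits, as a fixed-width lane would.
    return (value << amount) & ((1 << bits) - 1)


if __name__ == "__main__":
    print(shift_left_lane(0x81, 1, 8))  # in-range shift, high bit discarded
    print(shift_left_lane(0x01, 8, 8))  # amount equal to the lane width
```

A lowering that is only valid for in-range amounts on a generic target could, under this model, be applied unconditionally where the out-of-range case is defined.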
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98877
--- Comment #11 from Tamar Christina ---
(In reply to Andrew Pinski from comment #10)
> (In reply to Tamar Christina from comment #9)
> > While RA should be able to deal with this,
> > shouldn't we also just lower TBLs in gimple?
> >
> > This
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114151
--- Comment #3 from Tamar Christina ---
>
> This was a correctness fix btw, so I'm not sure we can easily recover - we
> could try using niter information for CHREC_VARIABLE but then there's
> variable niter here so I don't see a chance.
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114151
Bug ID: 114151
Summary: [14 Regression] weird and inefficient codegen and
addressing modes since
g:a0b1798042d033fd2cc2c806afbb77875dd2909b
Product: gcc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98877
--- Comment #9 from Tamar Christina ---
While RA should be able to deal with this,
shouldn't we also just lower TBLs in gimple?
There is no reason why this can't be a VEC_PERM_EXPR, which would also get the
copies removed at the gimple level and
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102171
--- Comment #3 from Tamar Christina ---
(In reply to Andrew Pinski from comment #2)
> I think I am going to implement this (or assign it interally to someone else
> to implement).
If you do, please also remove them from arm_neon.h and use the
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441
Tamar Christina changed:
What|Removed |Added
CC||rsandifo at gcc dot gnu.org
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86530
--- Comment #8 from Tamar Christina ---
(In reply to Andrew Pinski from comment #6)
> With my patch for V4QI, we still don't get the best code:
> vect_perm_even_271 = VEC_PERM_EXPR 4, 6 }>;
> vect_perm_even_273 = VEC_PERM_EXPR 4, 6 }>;
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441
--- Comment #27 from Tamar Christina ---
Created attachment 57538
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57538&action=edit
proposed1.patch
Proposed patch; this gets the gathers and scatters back. Doing a regression run.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114099
--- Comment #8 from Tamar Christina ---
Created attachment 57537
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57537&action=edit
uses.patch
The new code seems sensitive to visitation order, as get_virtual_phi returns NULL
for blocks which don't
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114081
Tamar Christina changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |tnfchris at gcc dot
gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114068
--- Comment #14 from Tamar Christina ---
patch submitted
https://gcc.gnu.org/pipermail/gcc-patches/2024-February/646415.html