https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80846
--- Comment #28 from Peter Cordes ---
(In reply to Richard Biener from comment #27)
> Note that this is deliberately left as-is because the target advertises
> (cheap) support for horizontal reduction. The vectorizer simply generates
> a single
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80846
Richard Biener changed:
What|Removed |Added
Status|REOPENED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80846
--- Comment #26 from Richard Biener ---
(In reply to Peter Cordes from comment #25)
> We're getting a spill/reload inside the loop with AVX512:
>
> .L2:
> vmovdqa64 (%esp), %zmm3
> vpaddd (%eax), %zmm3, %zmm2
> addl
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80846
--- Comment #25 from Peter Cordes ---
We're getting a spill/reload inside the loop with AVX512:
.L2:
vmovdqa64 (%esp), %zmm3
vpaddd (%eax), %zmm3, %zmm2
addl $64, %eax
vmovdqa64 %zmm2, (%esp)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80846
Rainer Orth changed:
What|Removed |Added
CC||ro at gcc dot gnu.org
--- Comment #23
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80846
--- Comment #24 from Rainer Orth ---
The new gcc.target/i386/pr80846-1.c testcase FAILs on Solaris/x86 (32 and
64-bit):
+FAIL: gcc.target/i386/pr80846-1.c scan-assembler-times vextracti 2 (found 1 times)
Assembler output attached.
Rainer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80846
--- Comment #22 from Peter Cordes ---
Forgot the Godbolt link with updated cmdline options:
https://godbolt.org/g/FCZAEj.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80846
Peter Cordes changed:
What|Removed |Added
Status|RESOLVED|REOPENED
Resolution|FIXED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80846
Richard Biener changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80846
--- Comment #19 from Richard Biener ---
Author: rguenth
Date: Fri Jan 12 11:43:13 2018
New Revision: 256576
URL: https://gcc.gnu.org/viewcvs?rev=256576&root=gcc&view=rev
Log:
2018-01-12 Richard Biener
PR
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80846
--- Comment #18 from Aldy Hernandez ---
Author: aldyh
Date: Wed Sep 13 16:15:07 2017
New Revision: 252229
URL: https://gcc.gnu.org/viewcvs?rev=252229&root=gcc&view=rev
Log:
PR target/80846
* config/rs6000/vsx.md (vextract_fp_from_shorth,
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80846
--- Comment #17 from Aldy Hernandez ---
Author: aldyh
Date: Wed Sep 13 16:10:45 2017
New Revision: 252207
URL: https://gcc.gnu.org/viewcvs?rev=252207&root=gcc&view=rev
Log:
PR target/80846
* optabs.def (vec_extract_optab, vec_init_optab):
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80846
--- Comment #16 from Jakub Jelinek ---
(In reply to rguent...@suse.de from comment #15)
> Yeah, I have a patch that does this. The question is how to query the target
> if the vector sizes share the same register set. Like we wouldn't want to go
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80846
--- Comment #15 from rguenther at suse dot de ---
On September 7, 2017 1:53:47 PM GMT+02:00, "jakub at gcc dot gnu.org"
wrote:
>https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80846
>
>--- Comment #14 from Jakub Jelinek
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80846
--- Comment #14 from Jakub Jelinek ---
(In reply to Richard Biener from comment #11)
> that's not using the unpacking strategy (sum adjacent elements) but still the
> vector shift approach (add upper/lower halves). That's sth that can be
>
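The two reduction strategies named in the quoted text can be compared with a scalar model. This is only a sketch in plain C, not the vectorizer's code: the function names `reduce_halves` and `reduce_adjacent` are hypothetical, and an `int` array stands in for the vector lanes (lane count assumed to be a power of two, at most 64).

```c
#include <assert.h>
#include <stddef.h>

/* Scalar model of the "vector shift" approach: at each step the upper
   half of the remaining lanes is added onto the lower half. */
int reduce_halves(const int *lanes, size_t n)
{
    int tmp[64];
    assert(n > 0 && n <= 64 && (n & (n - 1)) == 0);
    for (size_t i = 0; i < n; i++)
        tmp[i] = lanes[i];
    for (size_t width = n / 2; width > 0; width /= 2)
        for (size_t i = 0; i < width; i++)
            tmp[i] += tmp[i + width];
    return tmp[0];
}

/* Scalar model of the "unpacking" approach: at each step neighbouring
   lanes are summed pairwise, halving the number of live lanes. */
int reduce_adjacent(const int *lanes, size_t n)
{
    int tmp[64];
    assert(n > 0 && n <= 64 && (n & (n - 1)) == 0);
    for (size_t i = 0; i < n; i++)
        tmp[i] = lanes[i];
    for (size_t count = n; count > 1; count /= 2)
        for (size_t i = 0; i < count / 2; i++)
            tmp[i] = tmp[2 * i] + tmp[2 * i + 1];
    return tmp[0];
}
```

Both functions return the same sum for integer lanes; they differ only in which lanes are combined at each step, and so in which shuffles a real target would need.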
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80846
--- Comment #13 from Jakub Jelinek ---
Author: jakub
Date: Tue Aug 1 16:12:31 2017
New Revision: 250784
URL: https://gcc.gnu.org/viewcvs?rev=250784&root=gcc&view=rev
Log:
PR target/80846
* config/rs6000/vsx.md (vextract_fp_from_shorth,
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80846
--- Comment #12 from Jakub Jelinek ---
Author: jakub
Date: Tue Aug 1 08:26:14 2017
New Revision: 250759
URL: https://gcc.gnu.org/viewcvs?rev=250759&root=gcc&view=rev
Log:
PR target/80846
* optabs.def (vec_extract_optab, vec_init_optab):
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80846
--- Comment #11 from Richard Biener ---
So after Jakub's update the vectorizer patch yields
sumint:
.LFB0:
.cfi_startproc
vpxor %xmm0, %xmm0, %xmm0
leaq 4096(%rdi), %rax
.p2align 4,,10
.p2align 3
.L2:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80846
--- Comment #10 from Jakub Jelinek ---
Author: jakub
Date: Thu Jul 20 16:36:18 2017
New Revision: 250397
URL: https://gcc.gnu.org/viewcvs?rev=250397&root=gcc&view=rev
Log:
PR target/80846
* config/i386/i386.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80846
Jakub Jelinek changed:
What|Removed |Added
CC||jakub at gcc dot gnu.org
--- Comment #9
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80846
--- Comment #8 from Richard Biener ---
Created attachment 41422
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41422&action=edit
adjusted tree-vect-loop.c hunk
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80846
--- Comment #7 from Richard Biener ---
Note that similar to the vec_init optab not allowing constructing larger
vectors from smaller ones vec_extract doesn't allow extracting smaller vectors
from larger ones. So I might be forced to go V8SI ->
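The idea of extracting smaller vectors from a larger one can be illustrated with a scalar model. The sketch below is an assumption-laden illustration, not the patch under discussion: the types `v8si_model` and `v4si_model` and all function names are hypothetical stand-ins for an 8-lane and a 4-lane integer vector. It splits the wide vector into two halves, adds them, and finishes the reduction at the narrow width.

```c
#include <assert.h>

/* Hypothetical scalar stand-ins for 8-lane and 4-lane int vectors. */
typedef struct { int lane[8]; } v8si_model;
typedef struct { int lane[4]; } v4si_model;

/* Models extracting the low (high == 0) or high (high == 1) half. */
static v4si_model extract_half(const v8si_model *v, int high)
{
    v4si_model r;
    assert(high == 0 || high == 1);
    for (int i = 0; i < 4; i++)
        r.lane[i] = v->lane[i + (high ? 4 : 0)];
    return r;
}

/* Lane-wise addition at the narrow width. */
static v4si_model add4(v4si_model a, v4si_model b)
{
    v4si_model r;
    for (int i = 0; i < 4; i++)
        r.lane[i] = a.lane[i] + b.lane[i];
    return r;
}

/* Narrow first, then reduce the remaining four lanes. */
int hsum_narrow_first(const v8si_model *v)
{
    v4si_model s = add4(extract_half(v, 0), extract_half(v, 1));
    return (s.lane[0] + s.lane[2]) + (s.lane[1] + s.lane[3]);
}
```

After the first step every remaining operation works on the 4-lane type only, which is the property a narrowing reduction relies on.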
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80846
--- Comment #6 from Richard Biener ---
Similar with AVX512F I get
.L2:
vmovdqa64 -112(%rbp), %zmm3
addq $64, %rdi
vpaddd -64(%rdi), %zmm3, %zmm2
cmpq %rdi, %rax
vmovdqa64 %zmm2,
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80846
--- Comment #5 from Richard Biener ---
Created attachment 41421
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41421&action=edit
WIP patch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80846
--- Comment #4 from Richard Biener ---
(define_expand "<plusminus_insn><mode>3"
[(set (match_operand:VI_AVX2 0 "register_operand")
(plusminus:VI_AVX2
(match_operand:VI_AVX2 1 "vector_operand")
(match_operand:VI_AVX2 2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80846
Richard Biener changed:
What|Removed |Added
CC||vmakarov at gcc dot gnu.org
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80846
--- Comment #2 from Peter Cordes ---
(In reply to Richard Biener from comment #1)
> That is, it was supposed to end up using pslldq
I think you mean PSRLDQ. Byte zero is the right-most when drawn in a way that
makes bit/byte shift directions
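The direction question can be checked with a small byte-level model. This is a sketch in plain C rather than real intrinsics: the function names are hypothetical, and a 16-byte array stands in for a 128-bit register with index 0 as the least significant byte. A right byte shift moves the upper bytes down toward byte 0, which is what a reduction needs if the result is to end up in the lowest element.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Models a PSRLDQ-style shift: byte i receives byte i + count,
   and the vacated high bytes are zero-filled. */
void shift_right_bytes(uint8_t reg[16], unsigned count)
{
    assert(count <= 16);
    memmove(reg, reg + count, 16 - count);
    memset(reg + 16 - count, 0, count);
}

/* Models a PSLLDQ-style shift: byte i + count receives byte i,
   and the vacated low bytes are zero-filled. */
void shift_left_bytes(uint8_t reg[16], unsigned count)
{
    assert(count <= 16);
    memmove(reg + count, reg, 16 - count);
    memset(reg, 0, count);
}
```

Shifting right by 8 bytes places the former upper 64 bits in the lower 64 bits, so a following add combines the upper and lower halves in the low part of the register.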
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80846
Richard Biener changed:
What|Removed |Added
Status|UNCONFIRMED |ASSIGNED
Last reconfirmed|