https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104582
--- Comment #21 from CVS Commits ---
The master branch has been updated by Richard Biener :
https://gcc.gnu.org/g:90d693bdc9d71841f51d68826ffa5bd685d7f0bc
commit r12-7319-g90d693bdc9d71841f51d68826ffa5bd685d7f0bc
Author: Richard Biener
Date:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104582
--- Comment #20 from CVS Commits ---
The master branch has been updated by Richard Biener :
https://gcc.gnu.org/g:f24dfc76177b3994434c8beb287cde1a9976b5ce
commit r12-7318-gf24dfc76177b3994434c8beb287cde1a9976b5ce
Author: Richard Biener
Date:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104582
--- Comment #19 from CVS Commits ---
The master branch has been updated by Richard Biener :
https://gcc.gnu.org/g:61fc5e098e76c9809f35f449a70c9c8d74773d9d
commit r12-7317-g61fc5e098e76c9809f35f449a70c9c8d74773d9d
Author: Richard Biener
Date:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104582
--- Comment #18 from Andrew Pinski ---
(In reply to Andrew Pinski from comment #6)
> Hmm:
> _14 = {_1, _5};
> _8 = VIEW_CONVERT_EXPR<__int128>(_14);
>
> Wouldn't it be better to convert that to just (hopefully I got the order
> correct):
> t1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104582
--- Comment #17 from Richard Biener ---
For
FAIL: gcc.target/i386/pr91446.c scan-assembler-times vmovdqa[^\\n\\r]*xmm[0-9] 2
we used to produce
:
0: 48 83 ec 28 sub $0x28,%rsp
4: c4 e1 f9 6e d7
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104582
Richard Biener changed:
What|Removed |Added
See Also||https://gcc.gnu.org/bugzill
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104582
--- Comment #15 from Richard Biener ---
The patch will cause
FAIL: gcc.target/i386/pr91446.c scan-assembler-times vmovdqa[^\\n\\r]*xmm[0-9] 2
FAIL: gcc.target/i386/pr92658-avx512bw-2.c scan-assembler-times pmovsxdq 2
FAIL:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104582
--- Comment #14 from Richard Biener ---
Another testcase is
struct S { double a, b; } s;
void
foo (double a, double b)
{
  s.a = a;
  s.b = b;
}
which also receives the same costs and compiles vectorized to
unpcklpd %xmm1,%xmm0
movaps
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104582
--- Comment #13 from Richard Biener ---
Created attachment 52476
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52476&action=edit
minimal patch
This is a minimal untested patch adjusting APIs to allow for the cost hook to
receive a slp_node in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104582
Richard Biener changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104582
--- Comment #12 from rguenther at suse dot de ---
On Fri, 18 Feb 2022, jakub at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104582
>
> --- Comment #11 from Jakub Jelinek ---
> True.
> So another option is to try to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104582
--- Comment #11 from Jakub Jelinek ---
True.
So another option is to try to undo some of those short vectorization cases
during isel, expansion or later, though e.g. for the negdi2 case it will go
already during expansion into memory.
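For readers unfamiliar with the "short vectorization" case under discussion, the following is a rough hand-written C analogue of replacing two adjacent scalar stores with one vector store. It is only a sketch using the GNU C vector extension, not compiler output; the names `store_scalar`, `store_vector`, `v2ul` and `stores_agree` are hypothetical and do not appear in the bug report.

```c
#include <string.h>

struct S { unsigned long a, b; };

/* GNU C vector extension: two unsigned longs packed in one vector
   (hypothetical typedef, for illustration only).  */
typedef unsigned long v2ul
  __attribute__ ((vector_size (2 * sizeof (unsigned long))));

/* Scalar form: two independent stores.  */
void
store_scalar (struct S *p, unsigned long a, unsigned long b)
{
  p->a = a;
  p->b = b;
}

/* Analogue of the vectorized form: build {a, b} from the two scalars
   (the construction step), then do a single vector-sized store.  */
void
store_vector (struct S *p, unsigned long a, unsigned long b)
{
  v2ul v = { a, b };
  memcpy (p, &v, sizeof v);
}

/* Returns nonzero when both forms leave the same values in memory.  */
int
stores_agree (unsigned long a, unsigned long b)
{
  struct S x = { 0, 0 }, y = { 0, 0 };
  store_scalar (&x, a, b);
  store_vector (&y, a, b);
  return x.a == y.a && x.b == y.b;
}
```

Both functions store the same bytes; the thread is about whether building the vector from two general-purpose registers is worth it compared with two plain stores.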
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104582
--- Comment #10 from Richard Biener ---
Btw, I think it makes sense to build libgcc with -mno-sse, maybe even
-mgeneral-regs-only. Or globally with -fno-tree-vectorize (but we likely do
not want %xmm uses for parameter setup either with the
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104582
--- Comment #9 from Richard Biener ---
(In reply to Jakub Jelinek from comment #8)
> Just trying a dumb microbenchmark:
> struct S { unsigned long a, b; } s;
>
> __attribute__((noipa)) void
> foo (unsigned long a, unsigned long b)
> {
> s.a
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104582
--- Comment #8 from Jakub Jelinek ---
Just trying a dumb microbenchmark:
struct S { unsigned long a, b; } s;
__attribute__((noipa)) void
foo (unsigned long a, unsigned long b)
{
  s.a = a;
  s.b = b;
}
int
main ()
{
  int i;
  for (i = 0; i <
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104582
--- Comment #7 from Richard Biener ---
(In reply to Jakub Jelinek from comment #5)
> The costs look weird:
> _1 1 times scalar_store costs 12 in body
> _5 1 times scalar_store costs 12 in body
> _1 1 times vector_store costs 12 in body
> 1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104582
Andrew Pinski changed:
What|Removed |Added
Last reconfirmed||2022-02-18
Keywords|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104582
--- Comment #5 from Jakub Jelinek ---
The costs look weird:
_1 1 times scalar_store costs 12 in body
_5 1 times scalar_store costs 12 in body
_1 1 times vector_store costs 12 in body
1 times vec_construct costs 8 in prologue
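The arithmetic behind the figures quoted above can be spelled out with a small sketch. It assumes the comparison is a plain sum of the listed entries (body plus prologue); the real cost model may weigh things differently, and the enum and function names are hypothetical.

```c
/* Cost units exactly as quoted in the dump above.  */
enum
{
  SCALAR_STORE_COST = 12,
  VECTOR_STORE_COST = 12,
  VEC_CONSTRUCT_COST = 8
};

/* Two scalar stores in the body.  */
int
scalar_total (void)
{
  return 2 * SCALAR_STORE_COST;
}

/* One vector store in the body plus one vec_construct in the prologue.  */
int
vector_total (void)
{
  return VECTOR_STORE_COST + VEC_CONSTRUCT_COST;
}

/* Nonzero when the summed vector cost is below the summed scalar cost.  */
int
vector_looks_profitable (void)
{
  return vector_total () < scalar_total ();
}
```

Under that assumption the scalar side sums to 24 and the vector side to 20, so the vector form is treated as cheaper.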
vec_construct is
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104582
--- Comment #4 from Jakub Jelinek ---
What slp does is just
- w.s.low = _1;
- w.s.high = _5;
+ _14 = {_1, _5};
+ MEM[(union *)] = _14;
I must say I don't really see that as a beneficial optimization, construction
of a vector from scalars
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104582
Jakub Jelinek changed:
What|Removed |Added
CC||jakub at gcc dot gnu.org