https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069
--- Comment #37 from Xionghu Luo (luoxhu at gcc dot gnu.org) ---
https://gcc.gnu.org/pipermail/gcc-patches/2023-March/614932.html
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069
--- Comment #36 from Peter Bergner ---
(In reply to Jakub Jelinek from comment #34)
> What is the state of this PR? I see patches posted in August, but don't see
> anything committed...
I've seen some patch submissions and pings in February
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069
--- Comment #35 from Jakub Jelinek ---
Ping again.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069
Jakub Jelinek changed:
What|Removed |Added
CC||jakub at gcc dot gnu.org
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069
Richard Biener changed:
What|Removed |Added
Target Milestone|12.2|12.3
--- Comment #33 from Richard
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069
--- Comment #32 from Xionghu Luo (luoxhu at gcc dot gnu.org) ---
Thanks for all the information! It suggests to me that "native RTL should be
endian-independent", so both big-endian and little-endian platforms should
generate the same (vec_select
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069
--- Comment #31 from Xionghu Luo (luoxhu at gcc dot gnu.org) ---
Created attachment 53408
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53408&action=edit
0001-rs6000-Fix-incorrect-RTL-for-Power-LE-when-removing-
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069
--- Comment #30 from Richard Earnshaw ---
(In reply to rsand...@gcc.gnu.org from comment #29)
> (In reply to Segher Boessenkool from comment #28)
> > (In reply to rsand...@gcc.gnu.org from comment #25)
> > > - On big-endian targets, vector
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069
--- Comment #29 from rsandifo at gcc dot gnu.org ---
(In reply to Segher Boessenkool from comment #28)
> (In reply to rsand...@gcc.gnu.org from comment #25)
> > - On big-endian targets, vector loads and stores are assumed to put the
> >
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069
--- Comment #28 from Segher Boessenkool ---
(In reply to rsand...@gcc.gnu.org from comment #25)
> - On big-endian targets, vector loads and stores are assumed to put the
> first memory element at the most significant end of the vector
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069
--- Comment #27 from Segher Boessenkool ---
IMO what vec_select calls element 0 is always in the first argument of the
vec_concat it works on, in BE as well as LE. But yes, this is quite
underdefined in our documentation, and who knows what is
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069
--- Comment #26 from rsandifo at gcc dot gnu.org ---
> describes a different option on big-endian and little-endian
should have said: describes a different instruction. In other words,
the mapping of gimple to RTL operations is fixed, but
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069
--- Comment #25 from rsandifo at gcc dot gnu.org ---
AIUI the rules are:
- GCC vector lane numbers always correspond to memory array indices.
For example, lane 0 always comes first in memory.
- On big-endian targets, vector loads and
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069
Richard Biener changed:
What|Removed |Added
CC||rearnsha at gcc dot gnu.org,
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069
--- Comment #23 from Kewen Lin ---
> Ideally we would avoid semantic difference of RTL depending on the target.
> If that's not avoidable there should be target macros/hooks that specify
> the desired semantics.
Not sure, IMHO it seems it
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069
--- Comment #22 from rguenther at suse dot de ---
On Wed, 3 Aug 2022, linkw at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069
>
> --- Comment #21 from Kewen Lin ---
> I didn't look into this in detail, but
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069
--- Comment #21 from Kewen Lin ---
I didn't look into this in detail, but something in the culprit commit caught
my eye; take altivec_vmrghh as an example:
Before the patch, the pattern
[(set (match_operand:V8HI 0 "register_operand" "=v")
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069
--- Comment #20 from Xionghu Luo (luoxhu at gcc dot gnu.org) ---
Another check is to manually change the generated assembly, modifying the
source and index of vspltw, to verify:
luoxhu@gcc135 build $ diff q.bad.s q.good.s -U12
--- q.bad.s
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069
--- Comment #19 from Xionghu Luo (luoxhu at gcc dot gnu.org) ---
(In reply to Xionghu Luo (luo...@gcc.gnu.org) from comment #15)
> In combine: vec_select(vec_concat and the following vec_select are combined
> into a single extract instruction,
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069
--- Comment #18 from Richard Biener ---
(In reply to Richard Biener from comment #17)
> Seeing
>
> Trying 21 -> 24:
>21: r150:V4SI=vec_select(vec_concat(r146:V4SI,r141:V4SI),parallel)
> REG_DEAD r146:V4SI
> REG_DEAD r141:V4SI
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069
--- Comment #17 from Richard Biener ---
Seeing
Trying 21 -> 24:
21: r150:V4SI=vec_select(vec_concat(r146:V4SI,r141:V4SI),parallel)
REG_DEAD r146:V4SI
REG_DEAD r141:V4SI
24: {r151:SI=vec_select(r150:V4SI,parallel);clobber
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069
--- Comment #16 from luoxhu at gcc dot gnu.org ---
The attached files are all built with -mcpu=power8 and the case also fails on
P8LE.
I also verified that the code produces the expected output on P8BE. ('Aborted'
is caused by BE returning 0x41 instead of
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069
--- Comment #15 from luoxhu at gcc dot gnu.org ---
In combine: vec_select(vec_concat and the following vec_select are combined
into a single extract instruction, which seems reasonable for both LE and BE?
R146: 0 1 2 3
R141: 4 5 6 7
R150: 2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069
--- Comment #14 from luoxhu at gcc dot gnu.org ---
Created attachment 53354
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53354&action=edit
split2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069
--- Comment #13 from luoxhu at gcc dot gnu.org ---
Created attachment 53353
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53353&action=edit
after combine
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069
--- Comment #12 from luoxhu at gcc dot gnu.org ---
Created attachment 53352
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53352&action=edit
combine
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069
--- Comment #11 from Segher Boessenkool ---
I mean, if that patch is actually flawed, this is GCC 12 and later; if the
problem is more generic (combine, probably simplify-rtx to be exact) it is
more widespread.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069
--- Comment #10 from Segher Boessenkool ---
This happened after
commit 0910c516a3d72af048af27308349167f25c406c2
Author: Xionghu Luo
Date: Tue Oct 19 04:02:04 2021 -0500
which probably caused it. That means it would be GCC 12 and
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069
Richard Biener changed:
What|Removed |Added
Status|UNCONFIRMED |NEW
Last reconfirmed|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069
--- Comment #8 from luoxhu at gcc dot gnu.org ---
init-regs:
(insn 13 8 17 2 (set (reg:V4SI 141)
(vec_select:V4SI (vec_concat:V8SI (reg/v:V4SI 135 [ R2 ])
(reg/v:V4SI 133 [ R0 ]))
(parallel [
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069
--- Comment #7 from Segher Boessenkool ---
(The original insns, before this combination.)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069
--- Comment #6 from Segher Boessenkool ---
What is wrong there? It isn't obvious. You may need to show insns 188 and 199
in non-slim form; "slim" is very lossy.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069
--- Comment #5 from luoxhu at gcc dot gnu.org ---
Seems combine wrongly merged two vec_select instructions:
Trying 188 -> 199:
188: r343:V4SI=vec_select(vec_concat(r168:V4SI,r338:V4SI),parallel)
REG_DEAD r338:V4SI
REG_DEAD
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069
--- Comment #4 from luoxhu at gcc dot gnu.org ---
Reduced to:
#include
extern "C" void *memcpy(void *, const void *, unsigned long);
typedef __attribute__((altivec(vector__))) unsigned native_simd_type;
union {
native_simd_type V;
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069
--- Comment #3 from Marek Polacek ---
Sure. (If you're looking for a ppc64le machine, the compile farm has a few.)
$ diff -up q95.s q96.s
--- q95.s 2022-06-23 23:08:22.870777519 +
+++ q96.s 2022-06-23 23:08:10.990476157 +
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069
--- Comment #2 from luoxhu at gcc dot gnu.org ---
Could you also paste the ASM difference please? (I don't have an environment
at hand so far.)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069
--- Comment #1 from Marek Polacek ---
The difference between r12-4495 and r12-4496:
$ diff -up b/q.C.252r.expand a/q.C.252r.expand
--- b/q.C.252r.expand 2022-06-23 23:16:44.753507476 +
+++ a/q.C.252r.expand 2022-06-23
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069
Marek Polacek changed:
What|Removed |Added
Target Milestone|--- |12.2
Keywords|
38 matches