[Bug target/69493] Poor code generation for return of struct containing vectors on PPC64LE

2024-07-25 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69493

Sam James changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
             Blocks|                            |101926


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101926
[Bug 101926] [meta-bug] struct/complex/other argument passing and return should
be improved

[Bug target/69493] Poor code generation for return of struct containing vectors on PPC64LE

2020-05-26 Thread segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69493

--- Comment #11 from Segher Boessenkool ---
Why does our unpack expander use UNSPEC_UNPACK_128BIT at all?  Why
can it not simply generate simple code (without unspecs) directly?

(Same goes for "pack").

[Bug target/69493] Poor code generation for return of struct containing vectors on PPC64LE

2020-05-25 Thread luoxhu at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69493

--- Comment #10 from luoxhu at gcc dot gnu.org ---
At expand time, rs6000_emit_le_vsx_move emits two register permute
instructions on Power8 to byte swap the contents of each vector load or store.

P9:
    5: NOTE_INSN_BASIC_BLOCK 2
    2: r129:TF=%1:TF
    3: r130:TF=%3:TF
    4: NOTE_INSN_FUNCTION_BEG
    7: r117:DF=unspec[r129:TF,0] 70
    8: r131:V2DF=r121:V2DF
    9: r133:DF=vec_select(r131:V2DF,parallel)
   10: r131:V2DF=vec_concat(r117:DF,r133:DF)
   11: r122:V2DF=r131:V2DF
   12: r118:DF=unspec[r129:TF,0x1] 70
   13: r119:DF=unspec[r130:TF,0] 70
   14: r134:V2DF=r124:V2DF
   15: r136:DF=vec_select(r134:V2DF,parallel)
   16: r134:V2DF=vec_concat(r119:DF,r136:DF)
   17: r125:V2DF=r134:V2DF
   18: r120:DF=unspec[r130:TF,0x1] 70
   19: r137:V2DF=r122:V2DF
   20: r139:DF=vec_select(r137:V2DF,parallel)
   21: r137:V2DF=vec_concat(r139:DF,r118:DF)
   22: [r112:DI]=r137:V2DF
   23: r140:V2DF=r125:V2DF
   24: r142:DF=vec_select(r140:V2DF,parallel)
   25: r140:V2DF=vec_concat(r142:DF,r120:DF)
   26: [r112:DI+0x10]=r140:V2DF
   27: r143:V4SI=[r112:DI]
   28: r144:V4SI=[r112:DI+0x10]
   29: r127:V4SI=r143:V4SI
   30: r128:V4SI=r144:V4SI
   34: %2:V4SI=r127:V4SI
   35: %3:V4SI=r128:V4SI
   36: use %2:V4SI
   37: use %3:V4SI

P8:
    5: NOTE_INSN_BASIC_BLOCK 2
    2: r129:TF=%1:TF
    3: r130:TF=%3:TF
    4: NOTE_INSN_FUNCTION_BEG
    7: r117:DF=unspec[r129:TF,0] 70
    8: r131:V2DF=r121:V2DF
    9: r133:DF=vec_select(r131:V2DF,parallel)
   10: r131:V2DF=vec_concat(r117:DF,r133:DF)
   11: r122:V2DF=r131:V2DF
   12: r118:DF=unspec[r129:TF,0x1] 70
   13: r119:DF=unspec[r130:TF,0] 70
   14: r134:V2DF=r124:V2DF
   15: r136:DF=vec_select(r134:V2DF,parallel)
   16: r134:V2DF=vec_concat(r119:DF,r136:DF)
   17: r125:V2DF=r134:V2DF
   18: r120:DF=unspec[r130:TF,0x1] 70
   19: r137:V2DF=r122:V2DF
   20: r139:DF=vec_select(r137:V2DF,parallel)
   21: r137:V2DF=vec_concat(r139:DF,r118:DF)
   22: r140:V2DF=vec_select(r137:V2DF,parallel)
   23: [r112:DI]=vec_select(r140:V2DF,parallel)
   24: r141:V2DF=r125:V2DF
   25: r143:DF=vec_select(r141:V2DF,parallel)
   26: r141:V2DF=vec_concat(r143:DF,r120:DF)
   27: r144:V2DF=vec_select(r141:V2DF,parallel)
   28: [r112:DI+0x10]=vec_select(r144:V2DF,parallel)
   29: r146:V4SI=vec_select([r112:DI],parallel)
   30: r145:V4SI=vec_select(r146:V4SI,parallel)
   31: r148:V4SI=vec_select([r112:DI+0x10],parallel)
   32: r147:V4SI=vec_select(r148:V4SI,parallel)
   33: r127:V4SI=r145:V4SI
   34: r128:V4SI=r147:V4SI
   38: %2:V4SI=r127:V4SI
   39: %3:V4SI=r128:V4SI
   40: use %2:V4SI
   41: use %3:V4SI

The two dumps diverge at insn 22. For each stack store or load, Power8 emits
two vec_select insns, whereas Power9 needs only a single plain move.
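
The effect of the paired vec_selects can be modelled in portable C. This is
only a sketch, not code from GCC: v2df_model, swap_doublewords, p8_le_store
and p9_store are hypothetical names, and the elided "parallel" operand is
assumed to be the doubleword swap [1, 0] (the xxpermdi ...,2 seen in the
Power8 assembly in comment 6).

```c
#include <assert.h>

/* Hypothetical model of a V2DF value: two doublewords.  */
typedef struct { double d[2]; } v2df_model;

/* Models one vec_select whose parallel is assumed to be [1, 0],
   i.e. a swap of the two doublewords.  */
v2df_model
swap_doublewords (v2df_model v)
{
  v2df_model r = { { v.d[1], v.d[0] } };
  return r;
}

/* Models the Power8 LE store pair: a swap in a register, followed by
   a store that applies the same swap again on the way to memory.  */
v2df_model
p8_le_store (v2df_model reg)
{
  v2df_model swapped = swap_doublewords (reg);
  return swap_doublewords (swapped);
}

/* Models the Power9 store: a plain move, no permute.  */
v2df_model
p9_store (v2df_model reg)
{
  return reg;
}

/* Returns 1 when both sequences leave the same image in memory.  */
int
same_memory_image (v2df_model v)
{
  v2df_model a = p8_le_store (v);
  v2df_model b = p9_store (v);
  return a.d[0] == b.d[0] && a.d[1] == b.d[1];
}

/* Self-check on one sample value.  */
void
check_swap_model (void)
{
  v2df_model v = { { 1.5, -2.5 } };
  assert (same_memory_image (v));
}
```

In this model the two swaps cancel, so the Power8 pair is equivalent to the
single Power9 move.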

[Bug target/69493] Poor code generation for return of struct containing vectors on PPC64LE

2020-05-20 Thread luoxhu at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69493

luoxhu at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |luoxhu at gcc dot gnu.org

--- Comment #9 from luoxhu at gcc dot gnu.org ---
No load/store on Power9.
cat pr69493.s
.file   "pr69493.c"
.abiversion 2
.section        ".text"
.align 2
.p2align 4,,15
.globl test_big_double
.type   test_big_double, @function
test_big_double:
.LFB0:
.cfi_startproc
mfvsrd 7,1
mfvsrd 10,2
mfvsrd 8,3
mfvsrd 9,4
mtvsrdd 34,10,7
mtvsrdd 35,9,8
blr
.long 0
.byte 0,0,0,0,0,0,0,0
.cfi_endproc
.LFE0:
.size   test_big_double,.-test_big_double
.ident  "GCC: (GNU) 9.2.1 20191023 (Advance-Toolchain 13.0-1) [aba1f4e8b6ac]"
.gnu_attribute 4, 5
.section        .note.GNU-stack,"",@progbits

[Bug target/69493] Poor code generation for return of struct containing vectors on PPC64LE

2018-11-15 Thread bergner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69493

--- Comment #8 from Peter Bergner ---
I'll note that Kelvin's r256656 commit fixed the test case in Comment 6: we
know the loads and stores there are sufficiently aligned, and there are load
and store instructions that do the correct byte swap in LE mode when the
address is aligned.

However, we still produce poor code for the first test case.

[Bug target/69493] Poor code generation for return of struct containing vectors on PPC64LE

2017-11-17 Thread bergner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69493

--- Comment #7 from Peter Bergner ---
(In reply to Peter Bergner from comment #6)
> When compiling for POWER9, we get the code we want/expect:

FYI, we also get optimal code (i.e., just a blr) when compiling on POWER8 BE.

[Bug target/69493] Poor code generation for return of struct containing vectors on PPC64LE

2017-09-27 Thread bergner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69493

Peter Bergner changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bergner at gcc dot gnu.org

--- Comment #6 from Peter Bergner ---
Here is a simpler test case that shows the same problem when compiling for
POWER8. When compiling for POWER9, we get the code we want/expect:

bergner@pike:~/gcc/BUGS/PR70053$ cat pr69493-2.c 
typedef struct
{
  __vector double vx0;
  __vector double vx1;
} vec_t;

vec_t
foo (__vector double a, __vector double b)
{
  vec_t result;
  result.vx0 = a;
  result.vx1 = b;
  return result;
}

bergner@pike:~/gcc/BUGS/PR70053$ /home/bergner/gcc/build/gcc-fsf-mainline-pr70053-debug/gcc/xgcc -B/home/bergner/gcc/build/gcc-fsf-mainline-pr70053-debug/gcc -S -O2 -mcpu=power8 pr69493-2.c
bergner@pike:~/gcc/BUGS/PR70053$ cat pr69493-2.s 
...
foo:
addi 8,1,-96
li 10,32
xxpermdi 34,34,34,2
xxpermdi 35,35,35,2
li 9,48
stxvd2x 34,8,10
stxvd2x 35,8,9
lxvd2x 34,8,10
lxvd2x 35,8,9
xxpermdi 34,34,34,2
xxpermdi 35,35,35,2
blr


bergner@pike:~/gcc/BUGS/PR70053$ /home/bergner/gcc/build/gcc-fsf-mainline-pr70053-debug/gcc/xgcc -B/home/bergner/gcc/build/gcc-fsf-mainline-pr70053-debug/gcc -S -O2 -mcpu=power9 pr69493-2.c

bergner@pike:~/gcc/BUGS/PR70053$ cat pr69493-2.s 
...
foo:
blr
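
For comparing against other targets, the same shape can be written with the
generic vector extension instead of __vector double. This is a sketch under
the assumption of a GCC or Clang compiler; v2df, foo_generic and
check_foo_generic are hypothetical names, and nothing here claims that other
targets show the same code generation.

```c
#include <assert.h>

/* Generic 16-byte vector of two doubles (GCC/Clang vector extension),
   standing in for __vector double.  */
typedef double v2df __attribute__ ((vector_size (16)));

typedef struct
{
  v2df vx0;
  v2df vx1;
} vec_t;

/* Same shape as foo in pr69493-2.c: pack two vector arguments into a
   struct and return it by value.  */
vec_t
foo_generic (v2df a, v2df b)
{
  vec_t result;
  result.vx0 = a;
  result.vx1 = b;
  return result;
}

/* Sanity check: both vectors come back unchanged.  */
void
check_foo_generic (void)
{
  v2df a = { 1.0, 2.0 };
  v2df b = { 3.0, 4.0 };
  vec_t r = foo_generic (a, b);
  assert (r.vx0[0] == 1.0 && r.vx0[1] == 2.0);
  assert (r.vx1[0] == 3.0 && r.vx1[1] == 4.0);
}
```

Compiling this with -S -O2 and inspecting foo_generic shows whether the
return value goes through memory on a given target.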

[Bug target/69493] Poor code generation for return of struct containing vectors on PPC64LE

2016-03-10 Thread segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69493

--- Comment #5 from Segher Boessenkool ---
Ah, needs -mlittle, not just -mabi=elfv2.
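
One way to confirm that a given set of flags really yields a little-endian
ELFv2 build is to look at the predefined macros. The sketch below assumes
GCC or Clang (which predefine __BYTE_ORDER__) and assumes that _CALL_ELF is
predefined on PowerPC ELF targets; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Compile-time view, based on the compiler's predefined macros.  */
int
compiled_little_endian (void)
{
#if defined (__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
  return 1;
#else
  return 0;
#endif
}

/* Run-time cross-check: look at the first byte of a known value.  */
int
runtime_little_endian (void)
{
  uint32_t probe = 1;
  unsigned char first;
  memcpy (&first, &probe, 1);
  return first == 1;
}

/* ELF ABI version from _CALL_ELF, or 0 when the macro is not defined
   (assumed to mean a non-PowerPC target).  */
int
elf_abi_version (void)
{
#ifdef _CALL_ELF
  return _CALL_ELF;
#else
  return 0;
#endif
}

/* The compile-time and run-time views should agree.  */
void
check_endianness_consistent (void)
{
  assert (compiled_little_endian () == runtime_little_endian ());
}
```

With only -mabi=elfv2 on a big-endian default compiler, one would expect
elf_abi_version () to report 2 while compiled_little_endian () stays 0;
adding -mlittle should flip the latter.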

[Bug target/69493] Poor code generation for return of struct containing vectors on PPC64LE

2016-03-09 Thread wschmidt at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69493

--- Comment #4 from Bill Schmidt ---
I still see the problem with:

GCC: (GNU) 6.0.0 20160309 (experimental) [trunk revision 234085]

[Bug target/69493] Poor code generation for return of struct containing vectors on PPC64LE

2016-03-09 Thread wschmidt at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69493

--- Comment #3 from Bill Schmidt ---
That's interesting.  We have some other examples of similar issues we should
check as well before closing this.  I'll take a look in a bit.

[Bug target/69493] Poor code generation for return of struct containing vectors on PPC64LE

2016-03-09 Thread segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69493

--- Comment #2 from Segher Boessenkool ---
This seems fixed on current trunk (dse1 removes the reload from mem)?

[Bug target/69493] Poor code generation for return of struct containing vectors on PPC64LE

2016-01-26 Thread segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69493

Segher Boessenkool changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2016-01-26
                 CC|                            |segher at gcc dot gnu.org
     Ever confirmed|0                           |1

--- Comment #1 from Segher Boessenkool ---
Confirmed.  At expand time it already goes via memory.