[Bug rtl-optimization/7061] Access of bytes in struct parameters

2022-06-26 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=7061

--- Comment #11 from CVS Commits  ---
The master branch has been updated by Roger Sayle :

https://gcc.gnu.org/g:64d4f27a0ce47e97867512bda7fa5683acf8a134

commit r13-1282-g64d4f27a0ce47e97867512bda7fa5683acf8a134
Author: Roger Sayle 
Date:   Mon Jun 27 07:47:40 2022 +0100

Implement __imag__ of float _Complex using shufps on x86_64.

This patch is a follow-up improvement to my recent patch for
PR rtl-optimization/7061.  That patch added the test case
gcc.target/i386/pr7061-2.c:

float im(float _Complex a) { return __imag__ a; }

For which GCC on x86_64 currently generates:

movq    %xmm0, %rax
shrq    $32, %rax
movd    %eax, %xmm0
ret

but with this patch we now generate (the same as LLVM):

shufps  $85, %xmm0, %xmm0
ret

This is achieved by providing a define_insn_and_split that allows
truncated lshiftrt:DI by 32 to be performed on either SSE or general
regs, where if the register allocator prefers to use SSE, we split
to a shufps_v4si, or if not, we use a regular shrq.

2022-06-27  Roger Sayle  

gcc/ChangeLog
PR rtl-optimization/7061
* config/i386/i386.md (*highpartdisi2): New define_insn_and_split.

gcc/testsuite/ChangeLog
PR rtl-optimization/7061
* gcc.target/i386/pr7061-2.c: Update to look for shufps.
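
The "truncated lshiftrt:DI by 32" that the new pattern matches can be illustrated at the source level. The following is only a sketch, assuming a little-endian target such as x86_64 where float _Complex occupies 8 bytes with the real part first; the helper names imag_via_shift and check_imag_via_shift are hypothetical and are not part of the patch.

```c
#include <assert.h>
#include <complex.h>
#include <stdint.h>
#include <string.h>

/* Assumption: float _Complex (SCmode) is exactly as wide as a 64-bit
   integer (DImode), as on x86_64. */
_Static_assert(sizeof(float _Complex) == sizeof(uint64_t),
               "float _Complex must be 8 bytes");

/* Hypothetical helper: extract the imaginary part the way the integer
   sequence movq / shrq $32 / movd does. */
static float imag_via_shift(float _Complex a)
{
    uint64_t bits;
    uint32_t high;
    float result;

    memcpy(&bits, &a, sizeof bits);        /* view the complex value as 64 bits */
    high = (uint32_t)(bits >> 32);         /* logical shift right by 32, truncated */
    memcpy(&result, &high, sizeof result); /* view the high 32 bits as a float */
    return result;
}

/* Small self-check: the shift-based extraction agrees with __imag__. */
static void check_imag_via_shift(void)
{
    float _Complex z = 1.5f + 2.25f * I;
    assert(imag_via_shift(z) == __imag__ z);
}
```

Whether the shift is done with shrq on a general register or with shufps on an SSE register, the value produced is the same; the pattern only lets the register allocator choose where it happens.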

[Bug rtl-optimization/7061] Access of bytes in struct parameters

2022-06-11 Thread david.bolvansky at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=7061

Dávid Bolvanský  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |david.bolvansky at gmail dot com

--- Comment #10 from Dávid Bolvanský  ---
llvm emits just:
im: # @im
shufps  xmm0, xmm0, 85  # xmm0 = xmm0[1,1,1,1]
ret
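
To see why an immediate of 85 yields xmm0[1,1,1,1], here is a small C model of how SHUFPS decodes its 8-bit immediate. It is an illustrative sketch rather than compiler code; shufps_model and check_shufps_85 are hypothetical names.

```c
#include <assert.h>

/* Model of SHUFPS dst, src, imm8 on vectors of four floats: each result
   lane is chosen by a 2-bit selector.  The low two lanes are picked from
   dst and the high two lanes from src. */
static void shufps_model(float dst[4], const float src[4], unsigned imm8)
{
    float out[4];

    out[0] = dst[imm8 & 3];
    out[1] = dst[(imm8 >> 2) & 3];
    out[2] = src[(imm8 >> 4) & 3];
    out[3] = src[(imm8 >> 6) & 3];

    /* Write back only after all lanes are selected, so dst == src is safe. */
    for (int i = 0; i < 4; i++)
        dst[i] = out[i];
}

/* 85 is 0b01010101: every selector is 1, so lane 1 is broadcast. */
static void check_shufps_85(void)
{
    float x[4] = { 1.0f, 2.0f, 3.0f, 4.0f };

    shufps_model(x, x, 85);
    for (int i = 0; i < 4; i++)
        assert(x[i] == 2.0f);
}
```

For a float _Complex argument in xmm0, lane 0 holds the real part and lane 1 the imaginary part, so after the shuffle lane 0 (the float return value) holds the imaginary part.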

[Bug rtl-optimization/7061] Access of bytes in struct parameters

2022-06-10 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=7061

--- Comment #9 from CVS Commits  ---
The master branch has been updated by Roger Sayle :

https://gcc.gnu.org/g:1753a7120109c1d3b682f9487d6cca64fb2f0929

commit r13-1038-g1753a7120109c1d3b682f9487d6cca64fb2f0929
Author: Roger Sayle 
Date:   Fri Jun 10 15:14:23 2022 +0100

PR rtl-optimization/7061: Complex number arguments on x86_64-like ABIs.

This patch addresses the issue in comment #6 of PR rtl-optimization/7061
(a four-digit PR number) from 2006, where on x86_64 complex number arguments
are unconditionally spilled to the stack.

For the test cases below:
float re(float _Complex a) { return __real__ a; }
float im(float _Complex a) { return __imag__ a; }

GCC with -O2 currently generates:

re: movq    %xmm0, -8(%rsp)
    movss   -8(%rsp), %xmm0
    ret
im: movq    %xmm0, -8(%rsp)
    movss   -4(%rsp), %xmm0
    ret

with this patch we now generate:

re: ret
im: movq    %xmm0, %rax
    shrq    $32, %rax
    movd    %eax, %xmm0
    ret

[Technically, this shift can be performed on %xmm0 in a single
instruction, but the backend needs to be taught to do that; the
important bit is that the SCmode argument isn't written to the
stack.]

The patch itself is to emit_group_store where just before RTL
expansion commits to writing to the stack, we check if the store
group consists of a single scalar integer register that holds
a complex mode value; on x86_64 SCmode arguments are passed in
DImode registers.  If this is the case, we can use a SUBREG to
"view_convert" the integer to the equivalent complex mode.

An interesting corner case that showed up during testing is that
x86_64 also passes HCmode arguments in DImode registers(!), i.e.
using modes of different sizes.  This is easily handled/supported
by first converting to an integer mode of the correct size, and
then generating a complex mode SUBREG of this.  This is similar
in concept to the patch I proposed here:
https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590139.html

2022-06-10  Roger Sayle  

gcc/ChangeLog
PR rtl-optimization/7061
* expr.cc (emit_group_store): For groups that consist of a single
scalar integer register that holds a complex mode value, use
gen_lowpart to generate a SUBREG to "view_convert" to the complex
mode.  For modes of different sizes, first convert to an integer
mode of the appropriate size.

gcc/testsuite/ChangeLog
PR rtl-optimization/7061
* gcc.target/i386/pr7061-1.c: New test case.
* gcc.target/i386/pr7061-2.c: New test case.
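
The "view_convert" idea, including the narrower-mode corner case, can be sketched at the source level. This is an analogy in C, not the RTL that emit_group_store produces. It assumes a little-endian target with an 8-byte float _Complex; struct half_complex_bits is a hypothetical stand-in for a half-precision complex value (two 16-bit parts), and sc_from_di, hc_from_di and check_view_convert are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an HCmode value: two 16-bit parts, 4 bytes. */
struct half_complex_bits { uint16_t re, im; };

_Static_assert(sizeof(float _Complex) == 8, "assumes 8-byte float _Complex");
_Static_assert(sizeof(struct half_complex_bits) == 4, "assumes no padding");

/* Same size: reinterpret the 64-bit register contents directly. */
static float _Complex sc_from_di(uint64_t reg)
{
    float _Complex z;

    memcpy(&z, &reg, sizeof z);
    return z;
}

/* Different sizes: first narrow to an integer of the right width (32 bits),
   then reinterpret that narrower integer. */
static struct half_complex_bits hc_from_di(uint64_t reg)
{
    uint32_t narrow = (uint32_t)reg;
    struct half_complex_bits h;

    memcpy(&h, &narrow, sizeof h);
    return h;
}

/* Small self-check using the IEEE bit patterns of 1.0f and 2.0f. */
static void check_view_convert(void)
{
    float _Complex z = sc_from_di(0x400000003F800000ULL);

    assert(__real__ z == 1.0f);
    assert(__imag__ z == 2.0f);
}
```

Neither helper needs a stack slot in principle: the bits already sit in a register, and only their interpretation changes.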

[Bug rtl-optimization/7061] Access of bytes in struct parameters

2022-05-30 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=7061

--- Comment #8 from CVS Commits  ---
The master branch has been updated by Roger Sayle :

https://gcc.gnu.org/g:1ad584d538d349db13cfa8440222d91d5e9aff3f

commit r13-859-g1ad584d538d349db13cfa8440222d91d5e9aff3f
Author: Roger Sayle 
Date:   Mon May 30 21:32:58 2022 +0100

Allow SCmode and DImode to be tieable with TARGET_64BIT on x86_64.

This patch is a form of insurance policy in case my patch for PR 7061 runs
into problems on non-x86 targets; the middle-end can add an extra check
that the backend is happy placing SCmode and DImode values in the same
register, before creating a SUBREG.  Unfortunately, ix86_modes_tieable_p
currently claims this is not allowed(?), even though the default target
hook for modes_tieable_p is to always return true [i.e. false can be
used to specifically prohibit bad combinations], and the x86_64 ABI
passes SCmode values in DImode registers!  This makes the backend's
modes_tieable_p hook a little more forgiving, and additionally enables
interconversion between SCmode and V2SFmode, and between DCmode and
V2DFmode, which opens interesting opportunities in the future.

2022-05-30  Roger Sayle  

gcc/ChangeLog
* config/i386/i386.cc (ix86_modes_tieable_p): Allow SCmode to be
tieable with DImode on TARGET_64BIT, and SCmode tieable with
V2SFmode, and DCmode with V2DFmode.

[Bug rtl-optimization/7061] Access of bytes in struct parameters

2021-09-22 Thread gabravier at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=7061

Gabriel Ravier  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |gabravier at gmail dot com

--- Comment #7 from Gabriel Ravier  ---
Compiling this under ia64 seems to now be optimized perfectly as of at least
GCC 10, though the other ones look like they're still badly handled.

[Bug rtl-optimization/7061] Access of bytes in struct parameters

2006-02-08 Thread pluto at agmk dot net


--- Comment #6 from pluto at agmk dot net  2006-02-08 21:10 ---
one more testcase from x86-64 / gcc-4.1:

float re(float _Complex a) { return __real__ a; }

is compiled to:

re: movq    %xmm0, -8(%rsp)
    movss   -8(%rsp), %xmm0
    ret

but can be optimized to `movss %xmm0, %xmm0`.


-- 

pluto at agmk dot net changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |pluto at agmk dot net


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=7061



[Bug rtl-optimization/7061] Access of bytes in struct parameters

2005-11-02 Thread rth at gcc dot gnu dot org


--- Comment #5 from rth at gcc dot gnu dot org  2005-11-02 08:09 ---
And there is nothing Alpha specific about this.  Any target which passes
structures in registers can show it.  For instance, ia64:

f1:
.prologue
.body
.mmi
st2 [r12] = r32
nop 0
mov r14 = r12
;;
.mmi
ld1 r15 = [r14], 1
;;
ld1 r8 = [r14]
nop 0
;;
.mib
nop 0
add r8 = r8, r15
br.ret.sptk.many b0
.endp f1#
.align 16
.global f2#
.proc f2#
f2:
.prologue
.body
.mii
nop 0
extr.u r8 = r32, 8, 8
zxt1 r32 = r32
;;
.mib
nop 0
add r8 = r8, r32
br.ret.sptk.many b0
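
The source for f1 and f2 is not quoted in this message. As a purely hypothetical illustration, a pair of functions like the following would be expected to show the same contrast: byte members of a struct passed in a register (st2 to the stack, then two ld1 loads) versus bit-fields extracted directly from the register (extr.u / zxt1). The names sum_bytes, sum_fields and check_sums are assumptions made for this sketch.

```c
#include <assert.h>

/* Two byte-sized members: the struct fits in one register when passed. */
struct two_bytes { unsigned char a, b; };

/* The same two 8-bit values expressed as bit-fields. */
struct two_fields { unsigned a : 8, b : 8; };

/* Hypothetical analogue of f1: member access may force a spill to memory. */
unsigned long sum_bytes(struct two_bytes x)
{
    return x.a + x.b;
}

/* Hypothetical analogue of f2: bit-field access can use shifts and masks. */
unsigned long sum_fields(struct two_fields x)
{
    return x.a + x.b;
}

/* Both forms compute the same result; only the generated code differs. */
static void check_sums(void)
{
    struct two_bytes b = { 3, 4 };
    struct two_fields f = { 3, 4 };

    assert(sum_bytes(b) == sum_fields(f));
}
```

Compiling both with -O2 and comparing the assembly is one way to check whether a given target and GCC version still spills the struct.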


-- 

rth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   GCC host triplet|alphapca56-*-linux-gnu      |
 GCC target triplet|alphapca56-*-linux-gnu      |
            Summary|Alpha: Access of bytes in   |Access of bytes in struct
                   |struct parameters           |parameters


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=7061