[Bug rtl-optimization/7061] Access of bytes in struct parameters
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=7061

--- Comment #11 from CVS Commits ---
The master branch has been updated by Roger Sayle:

https://gcc.gnu.org/g:64d4f27a0ce47e97867512bda7fa5683acf8a134

commit r13-1282-g64d4f27a0ce47e97867512bda7fa5683acf8a134
Author: Roger Sayle
Date:   Mon Jun 27 07:47:40 2022 +0100

    Implement __imag__ of float _Complex using shufps on x86_64.

    This patch is a follow-up improvement to my recent patch for
    PR rtl-optimization/7061.  That patch added the test case
    gcc.target/i386/pr7061-2.c:

    float im(float _Complex a) { return __imag__ a; }

    For which GCC on x86_64 currently generates:

            movq    %xmm0, %rax
            shrq    $32, %rax
            movd    %eax, %xmm0
            ret

    but with this patch we now generate (the same as LLVM):

            shufps  $85, %xmm0, %xmm0
            ret

    This is achieved by providing a define_insn_and_split that allows
    truncated lshiftrt:DI by 32 to be performed on either SSE or general
    regs, where if the register allocator prefers to use SSE, we split
    to a shufps_v4si, or if not, we use a regular shrq.

    2022-06-27  Roger Sayle

    gcc/ChangeLog
            PR rtl-optimization/7061
            * config/i386/i386.md (*highpartdisi2): New define_insn_and_split.

    gcc/testsuite/ChangeLog
            PR rtl-optimization/7061
            * gcc.target/i386/pr7061-2.c: Update to look for shufps.
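The "truncated lshiftrt:DI by 32" that the commit message describes can be sketched in plain C. This is only an illustration of the bit-level operation, not code from the patch: the function name imag_via_shift is hypothetical, and the sketch assumes a little-endian target with 32-bit IEEE float, as on x86_64.

```c
#include <complex.h>   /* provides the I macro for callers */
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: extract the imaginary part the way the
   general-register sequence does.  Assumes little-endian layout,
   so the imaginary part sits in the upper 32 bits of the 8-byte value. */
static float imag_via_shift(float _Complex a)
{
    uint64_t bits;
    uint32_t high;
    float result;

    memcpy(&bits, &a, sizeof bits);         /* like movq %xmm0, %rax */
    high = (uint32_t)(bits >> 32);          /* like shrq $32, %rax, then truncate */
    memcpy(&result, &high, sizeof result);  /* like movd %eax, %xmm0 */
    return result;
}
```

Under those assumptions, imag_via_shift(1.5f + 2.5f * I) yields 2.5f, the same value as __imag__ of the argument.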
[Bug rtl-optimization/7061] Access of bytes in struct parameters
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=7061

Dávid Bolvanský changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |david.bolvansky at gmail dot com

--- Comment #10 from Dávid Bolvanský ---
llvm emits just:

im:                                     # @im
        shufps  xmm0, xmm0, 85          # xmm0 = xmm0[1,1,1,1]
        ret
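To see why the immediate 85 produces xmm0[1,1,1,1], it may help to model the lane selection in C. The sketch below is a software model only; shufps_lane is a hypothetical name, and the model covers just the four-lane selection rule (two result lanes taken from the destination operand, two from the source, each chosen by a 2-bit field of the immediate).

```c
#include <stdint.h>

/* Hypothetical software model of SHUFPS lane selection.
   Result lanes 0 and 1 are picked from dst, lanes 2 and 3 from src;
   bits [2*lane+1 : 2*lane] of imm8 give the index of the lane picked. */
static float shufps_lane(const float dst[4], const float src[4],
                         uint8_t imm8, int lane)
{
    const float *from = (lane < 2) ? dst : src;
    return from[(imm8 >> (2 * lane)) & 3u];
}
```

Since 85 is 0b01010101, every 2-bit field is 1, so with the same register as both operands all four result lanes receive lane 1, which holds the imaginary part of a float _Complex argument.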
[Bug rtl-optimization/7061] Access of bytes in struct parameters
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=7061

--- Comment #9 from CVS Commits ---
The master branch has been updated by Roger Sayle:

https://gcc.gnu.org/g:1753a7120109c1d3b682f9487d6cca64fb2f0929

commit r13-1038-g1753a7120109c1d3b682f9487d6cca64fb2f0929
Author: Roger Sayle
Date:   Fri Jun 10 15:14:23 2022 +0100

    PR rtl-optimization/7061: Complex number arguments on x86_64-like ABIs.

    This patch addresses the issue in comment #6 of PR rtl-optimization/7061
    (a four digit PR number) from 2006 where on x86_64 complex number
    arguments are unconditionally spilled to the stack.

    For the test cases below:

    float re(float _Complex a) { return __real__ a; }
    float im(float _Complex a) { return __imag__ a; }

    GCC with -O2 currently generates:

    re:     movq    %xmm0, -8(%rsp)
            movss   -8(%rsp), %xmm0
            ret
    im:     movq    %xmm0, -8(%rsp)
            movss   -4(%rsp), %xmm0
            ret

    with this patch we now generate:

    re:     ret
    im:     movq    %xmm0, %rax
            shrq    $32, %rax
            movd    %eax, %xmm0
            ret

    [Technically, this shift can be performed on %xmm0 in a single
    instruction, but the backend needs to be taught to do that, the
    important bit is that the SCmode argument isn't written to the stack].

    The patch itself is to emit_group_store where just before RTL
    expansion commits to writing to the stack, we check if the store
    group consists of a single scalar integer register that holds a
    complex mode value; on x86_64 SCmode arguments are passed in DImode
    registers.  If this is the case, we can use a SUBREG to "view_convert"
    the integer to the equivalent complex mode.

    An interesting corner case that showed up during testing is that
    x86_64 also passes HCmode arguments in DImode registers(!), i.e.
    using modes of different sizes.  This is easily handled/supported by
    first converting to an integer mode of the correct size, and then
    generating a complex mode SUBREG of this.

    This is similar in concept to the patch I proposed here:
    https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590139.html

    2022-06-10  Roger Sayle

    gcc/ChangeLog
            PR rtl-optimization/7061
            * expr.cc (emit_group_store): For groups that consist of a single
            scalar integer register that holds a complex mode value, use
            gen_lowpart to generate a SUBREG to "view_convert" to the complex
            mode.  For modes of different sizes, first convert to an integer
            mode of the appropriate size.

    gcc/testsuite/ChangeLog
            PR rtl-optimization/7061
            * gcc.target/i386/pr7061-1.c: New test case.
            * gcc.target/i386/pr7061-2.c: New test case.
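The "view_convert" idea, and the HCmode corner case of narrowing first, can be approximated at the C level with memcpy. This is a loose analogy rather than the RTL the patch generates: the names complex_from_di, half_complex_bits and half_complex_from_di are hypothetical, the half-precision parts are kept as raw 16-bit patterns to avoid depending on _Float16 support, and a little-endian target is assumed.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical analogue of the SUBREG: reinterpret the 64 bits of an
   integer register as a float _Complex without changing any bit. */
static float _Complex complex_from_di(uint64_t bits)
{
    float _Complex z;
    memcpy(&z, &bits, sizeof z);
    return z;
}

/* Raw bit patterns of a half-precision complex value (hypothetical type). */
struct half_complex_bits {
    uint16_t re;
    uint16_t im;
};

/* Hypothetical analogue of the different-size case: the value occupies
   only the low 32 bits of the 64-bit register, so first narrow to an
   integer of the right size, then reinterpret that. */
static struct half_complex_bits half_complex_from_di(uint64_t bits)
{
    uint32_t narrow = (uint32_t)bits;
    struct half_complex_bits h;
    memcpy(&h, &narrow, sizeof h);
    return h;
}
```

On a little-endian target, complex_from_di(0x402000003FC00000ULL) has real part 1.5f and imaginary part 2.5f, and neither function needs the value to pass through a stack slot in principle.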
[Bug rtl-optimization/7061] Access of bytes in struct parameters
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=7061

--- Comment #8 from CVS Commits ---
The master branch has been updated by Roger Sayle:

https://gcc.gnu.org/g:1ad584d538d349db13cfa8440222d91d5e9aff3f

commit r13-859-g1ad584d538d349db13cfa8440222d91d5e9aff3f
Author: Roger Sayle
Date:   Mon May 30 21:32:58 2022 +0100

    Allow SCmode and DImode to be tieable with TARGET_64BIT on x86_64.

    This patch is a form of insurance policy in case my patch for PR 7061
    runs into problems on non-x86 targets; the middle-end can add an extra
    check that the backend is happy placing SCmode and DImode values in
    the same register, before creating a SUBREG.  Unfortunately,
    ix86_modes_tieable_p currently claims this is not allowed(?), even
    though the default target hook for modes_tieable_p is to always
    return true [i.e. false can be used to specifically prohibit bad
    combinations], and the x86_64 ABI passes SCmode values in DImode
    registers!  This makes the backend's modes_tieable_p hook a little
    more forgiving, and additionally enables interconversion between
    SCmode and V2SFmode, and between DCmode and V2DFmode, which opens
    interesting opportunities in the future.

    2022-05-30  Roger Sayle

    gcc/ChangeLog
            * config/i386/i386.cc (ix86_modes_tieable_p): Allow SCmode to be
            tieable with DImode on TARGET_64BIT, and SCmode tieable with
            V2SFmode, and DCmode with V2DFmode.
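As a rough C-level picture of what "tieable" means here, the same eight bytes can be viewed as a 64-bit integer, a float _Complex, or a pair of floats. The union below is a hypothetical illustration, not part of GCC; it relies on the C rule that a complex type has the same representation as a two-element array of the corresponding real type.

```c
#include <complex.h>   /* provides the I macro for callers */
#include <stdint.h>

/* Hypothetical 64-bit storage with three views, loosely mirroring
   DImode, SCmode and V2SFmode sharing one register. */
union reg64 {
    uint64_t di;        /* integer view */
    float _Complex sc;  /* single-precision complex view */
    float v2sf[2];      /* two-float view: [0] real, [1] imaginary */
};

/* Read one float lane of a complex value through the two-float view. */
static float lane_of_complex(float _Complex z, int lane)
{
    union reg64 r;
    r.sc = z;
    return r.v2sf[lane];
}
```

For example, lane_of_complex(1.5f + 2.5f * I, 1) returns 2.5f without any arithmetic, which is the kind of reinterpretation the hook has to permit before the middle-end can create the SUBREG.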
[Bug rtl-optimization/7061] Access of bytes in struct parameters
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=7061

Gabriel Ravier changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |gabravier at gmail dot com

--- Comment #7 from Gabriel Ravier ---
Compiling this under ia64 seems to now be optimized perfectly as of at least
GCC 10, though the other ones look like they're still badly handled.
[Bug rtl-optimization/7061] Access of bytes in struct parameters
--- Comment #6 from pluto at agmk dot net 2006-02-08 21:10 ---
one more testcase from x86-64 / gcc-4.1:

float re(float _Complex a) { return __real__ a; }

is compiled to:

re:     movq    %xmm0, -8(%rsp)
        movss   -8(%rsp), %xmm0
        ret

but can be optimized to `movss %xmm0, %xmm0`.

-- 

pluto at agmk dot net changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |pluto at agmk dot net

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=7061
[Bug rtl-optimization/7061] Access of bytes in struct parameters
--- Comment #5 from rth at gcc dot gnu dot org 2005-11-02 08:09 ---
And there is nothing Alpha specific about this.  Any target which passes
structures in registers can show it.  For instance, ia64:

f1:
        .prologue
        .body
        .mmi
        st2 [r12] = r32
        nop 0
        mov r14 = r12
        ;;
        .mmi
        ld1 r15 = [r14], 1
        ;;
        ld1 r8 = [r14]
        nop 0
        ;;
        .mib
        nop 0
        add r8 = r8, r15
        br.ret.sptk.many b0
        .endp f1#
        .align 16
        .global f2#
        .proc f2#
f2:
        .prologue
        .body
        .mii
        nop 0
        extr.u r8 = r32, 8, 8
        zxt1 r32 = r32
        ;;
        .mib
        nop 0
        add r8 = r8, r32
        br.ret.sptk.many b0

-- 

rth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   GCC host triplet|alphapca56-*-linux-gnu      |
 GCC target triplet|alphapca56-*-linux-gnu      |
            Summary|Alpha: Access of bytes in   |Access of bytes in struct
                   |struct parameters           |parameters

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=7061
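The source for f1 and f2 is not quoted in this part of the thread. The following is an assumed reconstruction, chosen only because it is consistent with the ia64 output above (f1 stores two bytes to the stack and reloads them one at a time, f2 extracts two 8-bit fields directly from the register); the struct names s1 and s2 are hypothetical.

```c
/* Assumed reconstruction, not quoted from the bug report. */

/* Two byte-sized members: the f1 listing spills the register
   argument to memory and loads each byte back. */
struct s1 {
    unsigned char a, b;
};

/* The same layout expressed as bit-fields: the f2 listing stays in
   registers and uses extract / zero-extend instructions. */
struct s2 {
    unsigned a : 8, b : 8;
};

unsigned long f1(struct s1 x)
{
    return x.a + x.b;
}

unsigned long f2(struct s2 x)
{
    return x.a + x.b;
}
```

Both functions compute the same sum, so the difference in the generated code comes only from how the members of the register-passed struct are accessed.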