https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70471

            Bug ID: 70471
           Summary: Superfluous move instructions in floating-point
                    instruction sequence
           Product: gcc
           Version: 5.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: schnetter at gmail dot com
  Target Milestone: ---

In a function consisting of a long chain of floating-point operations, GCC
5.3.0 on Darwin 15.4.0 targeting an "Intel(R) Core(TM) i7-4980HQ CPU @ 2.80GHz"
with options

-march=native
-Ofast
-fopenmp

generates this sequence (line 6302 ff. of the compiler output linked below):

        vmovapd %ymm4, %ymm8
        vmovapd %ymm4, -10192(%rbp)
        vsubpd  %ymm1, %ymm8, %ymm0
        vmovapd -4496(%rbp), %ymm8

Am I right in my interpretation that the first vmovapd is strictly
superfluous? The register %ymm8 is set on the first line, is overwritten on the
last line, and is read only once in between, on the third line, where the
original value is still available in %ymm4.
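
The reasoning above can be checked mechanically. The following is a minimal
Python sketch, not anything GCC itself does: it tests whether a
register-to-register copy is overwritten before its source register is
clobbered, and if so rewrites the later reads to use the source. The helper
names (parse, reads_writes, copy_is_removable, drop_copy) are hypothetical. It
assumes straight-line code with no labels or jumps, a write-only destination
operand (true for the VEX-encoded instructions quoted here, not for two-operand
SSE forms), and it ignores %xmm/%ymm register aliasing.

```python
import re

# The four instructions quoted above (AT&T syntax: sources first, destination last).
SNIPPET = [
    "vmovapd %ymm4, %ymm8",
    "vmovapd %ymm4, -10192(%rbp)",
    "vsubpd  %ymm1, %ymm8, %ymm0",
    "vmovapd -4496(%rbp), %ymm8",
]

REG = re.compile(r"%[a-z0-9]+")
# Assumption: only these plain vector moves are treated as register copies.
MOVE_OPS = {"vmovapd", "vmovaps", "vmovupd", "vmovups", "vmovdqa", "vmovdqu"}


def parse(line):
    """Split 'op a, b, c' into (op, [source operands], destination operand)."""
    op, rest = line.split(None, 1)
    # Split on commas that are not inside a memory operand such as (%rax,%rbx,8).
    operands = [o.strip() for o in re.split(r",(?![^()]*\))", rest)]
    return op, operands[:-1], operands[-1]


def reads_writes(line):
    """Registers read and written, assuming the destination is write-only."""
    _, srcs, dst = parse(line)
    reads = {r for s in srcs for r in REG.findall(s)}
    if REG.fullmatch(dst):
        return reads, {dst}
    # Memory destination: its registers only form the address, so they are read.
    return reads | set(REG.findall(dst)), set()


def copy_is_removable(lines, i):
    """True if the register copy at lines[i] is dead once later reads use its source."""
    op, srcs, dst = parse(lines[i])
    if op not in MOVE_OPS or len(srcs) != 1:
        return False
    src = srcs[0]
    if not (REG.fullmatch(src) and REG.fullmatch(dst)):
        return False
    for line in lines[i + 1:]:
        _, writes = reads_writes(line)
        if dst in writes:
            return True   # copy overwritten while the source was still intact
        if src in writes:
            return False  # source clobbered first; the copy may be needed
    return False          # copy may still be live after the snippet


def drop_copy(lines, i):
    """Return the sequence without lines[i], reading the source register instead."""
    _, (src,), dst = parse(lines[i])
    out, live = list(lines[:i]), True
    for line in lines[i + 1:]:
        op, srcs, d = parse(line)
        if live:
            # Redirect reads of the copy to the source until the copy is overwritten.
            srcs = [re.sub(re.escape(dst) + r"\b", src, s) for s in srcs]
            live = d != dst
        out.append(f"{op} {', '.join(srcs + [d])}")
    return out


if copy_is_removable(SNIPPET, 0):
    print("\n".join(drop_copy(SNIPPET, 0)))
```

For the quoted sequence the check succeeds, and the rewritten sequence has
three instructions, with vsubpd reading %ymm4 directly. The check is
deliberately conservative: it answers "no" whenever the copy might still be
live at the end of the lines it is given.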

Additional comments:

This is not the only place where inspection by eye suggests superfluous move
instructions; they seem to occur in many places. As you can see below, both the
input and the generated code are quite long. The cache footprint of the code is
probably what matters most for performance, so superfluous instructions are a
concern.



Details:

$ uname -a
Darwin Redshift 15.4.0 Darwin Kernel Version 15.4.0: Fri Feb 26 22:08:05 PST 2016; root:xnu-3248.40.184~3/RELEASE_X86_64 x86_64

$ /Users/eschnett/src/spack/opt/spack/darwin-x86_64/gcc-4.2.1/gcc-5.3.0-j3ujojmhirf6t2mi5enfosb6545duy42/bin/g++ -v
Reading specs from
/Users/eschnett/src/spack/opt/spack/darwin-x86_64/gcc-4.2.1/gcc-5.3.0-j3ujojmhirf6t2mi5enfosb6545duy42/lib64/gcc/x86_64-apple-darwin15.4.0/5.3.0/specs
COLLECT_GCC=/Users/eschnett/src/spack/opt/spack/darwin-x86_64/gcc-4.2.1/gcc-5.3.0-j3ujojmhirf6t2mi5enfosb6545duy42/bin/g++
COLLECT_LTO_WRAPPER=/Users/eschnett/src/spack/opt/spack/darwin-x86_64/gcc-4.2.1/gcc-5.3.0-j3ujojmhirf6t2mi5enfosb6545duy42/libexec/gcc/x86_64-apple-darwin15.4.0/5.3.0/lto-wrapper
Target: x86_64-apple-darwin15.4.0
Configured with:
/Users/eschnett/src/spack/var/spack/stage/gcc-5.3.0-j3ujojmhirf6t2mi5enfosb6545duy42/gcc-5.3.0/configure
--prefix=/Users/eschnett/src/spack/opt/spack/darwin-x86_64/gcc-4.2.1/gcc-5.3.0-j3ujojmhirf6t2mi5enfosb6545duy42
--libdir=/Users/eschnett/src/spack/opt/spack/darwin-x86_64/gcc-4.2.1/gcc-5.3.0-j3ujojmhirf6t2mi5enfosb6545duy42/lib64
--disable-multilib --enable-languages=fortran,c,java,objc,c++
--with-mpc=/Users/eschnett/src/spack/opt/spack/darwin-x86_64/gcc-4.2.1/mpc-1.0.3-kg7pswhyszxa6vbgohqjhy2pywb76gpc
--with-mpfr=/Users/eschnett/src/spack/opt/spack/darwin-x86_64/gcc-4.2.1/mpfr-3.1.4-taib2hirt72ggnirqb2brytc4cvp2igf
--with-gmp=/Users/eschnett/src/spack/opt/spack/darwin-x86_64/gcc-4.2.1/gmp-6.1.0-ld7rtqn2neg3z47mzg2vnexqeet4pz3i
--enable-lto --with-quad
--with-isl=/Users/eschnett/src/spack/opt/spack/darwin-x86_64/gcc-4.2.1/isl-0.14-cn4dbzoocjsf2a5jwamnhnverh2hwccr
Thread model: posix
gcc version 5.3.0 (GCC) 

Compiled with:
$ /Users/eschnett/src/spack/opt/spack/darwin-x86_64/gcc-4.2.1/gcc-5.3.0-j3ujojmhirf6t2mi5enfosb6545duy42/bin/g++ -fopenmp -march=native -std=gnu++11 -Ofast -S ML_BSSN_FD4_EvolutionInterior.cc.i

Pre-processed input "ML_BSSN_FD4_EvolutionInterior.cc.i" (3.5 MByte):
https://gist.github.com/eschnett/10bf0b2b1977348f3e15ae29db871bb0

Compiler output "ML_BSSN_FD4_EvolutionInterior.cc.s" (470 kByte):
https://gist.github.com/eschnett/79d31fe08e9588d28763a9ad5c77ccfa
