[Bug rtl-optimization/45498] New: Optimisations fail above arbitrary level of complexity

2010-09-02 Thread jonathan dot morton at movial dot com
Slightly increasing the complexity of a function can disproportionately
increase the size and runtime of the generated code.  This appears to be due to
the optimisers giving up on code blocks above a certain abstract size, and is
particularly severe on PPC and ARM, but is observable on ia32 and amd64 as
well.

This is a general problem which affects any large function, and has done so
since at least the GCC 3 days - I first encountered it when trying to use
Altivec intrinsics.  In some cases manually moving a function call *out* of a
loop results in 4x the runtime, which is the opposite of normal expectations.
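
For illustration only, the kind of manual transformation meant here can be
sketched as follows.  The function names and the gain_for helper are
hypothetical and are not taken from the attached test case.  The sketch shows
only the hoist itself; on a function this small the two variants would be
expected to compile to equivalent code, and the slowdown described above needs
a much larger function body to appear.

```c
#include <stddef.h>

/* Hypothetical helper: its result depends only on its argument, so a call
   with a loop-invariant argument can legally be moved out of the loop. */
static float gain_for(int mode)
{
    return mode == 0 ? 1.0f : 0.5f;
}

/* Call left inside the loop; the optimiser is expected to hoist it. */
float sum_scaled_in_loop(const float *v, size_t n, int mode)
{
    float acc = 0.0f;
    size_t i;

    for (i = 0; i < n; i++)
        acc += v[i] * gain_for(mode);
    return acc;
}

/* Call hoisted by hand; normally this should never be the slower variant. */
float sum_scaled_hoisted(const float *v, size_t n, int mode)
{
    const float gain = gain_for(mode);
    float acc = 0.0f;
    size_t i;

    for (i = 0; i < n; i++)
        acc += v[i] * gain;
    return acc;
}
```

Comparing the assembly of two such variants (for example with -S or
-save-temps) is one way to see whether the hand-hoisted version has tipped the
function into worse code.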

Attached is an example which demonstrates poor code generated after a long
series of inlining and dead code elimination stages.  Demonstration is on
PPC32, but the same example suffices for ARMv7-A as well.  An amd64 target
produces reasonable code for this example, but a fairly small complexity
increase causes a similar collapse.

The output is two functions, one generated from the tower of inlining, and the
other (with a manual_ prefix) after the same optimisations were performed
manually.  The quality of the latter is clearly better than the former, which
contains the following sequence in the inner loop:

fmuls 0,12,0
stw 4,32(1)
stw 3,16(1)
stw 3,20(1)
stfs 0,8(1)
lwz 4,8(1)
stw 4,24(1)
li 4,0
stw 4,36(1)
lfs 0,24(1)
fmadds 0,9,0,10

All of the stw instructions in the above fragment are dead, except for
stw 4,24(1), which is part of a sequence that merely shuffles the value from f0
through two memory locations and back to f0.  The li 4,0 also demonstrates very
poor register allocation, since r4 already contains zero before this fragment.
In the manual variant, the fmuls is immediately followed by the fmadds.

The same source file run through Clang on amd64 produces virtually identical
output for the two versions.
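
As a rough sketch of the pattern, and not the attached test case itself, a
"tower of inlining" and its manual_ counterpart might look like the code below.
The enum, the helper names and the arithmetic are all hypothetical; the real
example has many more layers.  The idea is that a constant selector passed
through inline helpers should let inlining and dead code elimination reduce the
generic version to the same loop as the hand-written one.

```c
#include <stddef.h>

/* Hypothetical operator selector; not taken from the attached source. */
enum op { OP_SCALE, OP_SCALE_ADD };

/* Innermost layer: dispatches on a selector that is constant at each use. */
static inline float apply_op(enum op o, float x, float k, float bias)
{
    switch (o) {
    case OP_SCALE:
        return x * k;
    case OP_SCALE_ADD:
        return x * k + bias;
    }
    return x;
}

/* Middle layer: generic loop, specialised only through inlining. */
static inline void generic_loop(enum op o, float *dst, const float *src,
                                size_t n, float k, float bias)
{
    size_t i;

    for (i = 0; i < n; i++)
        dst[i] = apply_op(o, src[i], k, bias);
}

/* Version produced by the tower of inlining: once the constant selector is
   propagated, the switch in apply_op should be eliminated as dead code. */
void scale_add(float *dst, const float *src, size_t n, float k, float bias)
{
    generic_loop(OP_SCALE_ADD, dst, src, n, k, bias);
}

/* The same operation with the optimisations performed by hand. */
void manual_scale_add(float *dst, const float *src, size_t n, float k,
                      float bias)
{
    size_t i;

    for (i = 0; i < n; i++)
        dst[i] = src[i] * k + bias;
}
```

With only two layers a compiler would be expected to emit matching inner loops
for scale_add and manual_scale_add; the report concerns what happens when the
tower grows past some threshold.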


-- 
           Summary: Optimisations fail above arbitrary level of complexity
           Product: gcc
           Version: 4.4.3
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: jonathan dot morton at movial dot com
 GCC build triplet: powerpc-linux-gnu
  GCC host triplet: powerpc-linux-gnu
GCC target triplet: powerpc-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45498





--- Comment #1 from jonathan dot morton at movial dot com  2010-09-02 11:40 ---
Created an attachment (id=21661)
 -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=21661&action=view)
Preprocessed source of test case.







--- Comment #2 from jonathan dot morton at movial dot com  2010-09-02 11:41 ---
Compiler output:

$ gcc -v -save-temps -mcpu=G3 -Wall -O3 -ffast-math -c isolated-src-a8-u.c
Using built-in specs.
Target: powerpc-unknown-linux-gnu
Configured with:
/var/tmp/portage/sys-devel/gcc-4.4.3-r2/work/gcc-4.4.3/configure --prefix=/usr
--bindir=/usr/powerpc-unknown-linux-gnu/gcc-bin/4.4.3
--includedir=/usr/lib/gcc/powerpc-unknown-linux-gnu/4.4.3/include
--datadir=/usr/share/gcc-data/powerpc-unknown-linux-gnu/4.4.3
--mandir=/usr/share/gcc-data/powerpc-unknown-linux-gnu/4.4.3/man
--infodir=/usr/share/gcc-data/powerpc-unknown-linux-gnu/4.4.3/info
--with-gxx-include-dir=/usr/lib/gcc/powerpc-unknown-linux-gnu/4.4.3/include/g++-v4
--host=powerpc-unknown-linux-gnu --build=powerpc-unknown-linux-gnu
--enable-altivec --disable-fixed-point --without-ppl --without-cloog
--enable-nls --without-included-gettext --with-system-zlib --disable-werror
--enable-secureplt --disable-multilib --enable-libmudflap --disable-libssp
--enable-libgomp
--with-python-dir=/share/gcc-data/powerpc-unknown-linux-gnu/4.4.3/python
--enable-checking=release --disable-libgcj --enable-objc-gc
--enable-languages=c,c++,objc,obj-c++,fortran --enable-shared
--enable-threads=posix --enable-__cxa_atexit --enable-clocale=gnu
--with-bugurl=http://bugs.gentoo.org/ --with-pkgversion='Gentoo 4.4.3-r2 p1.2'
Thread model: posix
gcc version 4.4.3 (Gentoo 4.4.3-r2 p1.2)
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-mcpu=G3' '-Wall' '-O3' '-ffast-math'
'-c'
 /usr/libexec/gcc/powerpc-unknown-linux-gnu/4.4.3/cc1 -E -quiet -v -D__unix__
-D__gnu_linux__ -D__linux__ -Dunix -D__unix -Dlinux -D__linux -Asystem=linux
-Asystem=unix -Asystem=posix isolated-src-a8-u.c -D_FORTIFY_SOURCE=2
-msecure-plt -mcpu=G3 -Wall -ffast-math -O3 -fpch-preprocess -o
isolated-src-a8-u.i
ignoring nonexistent directory /usr/local/include
ignoring nonexistent directory
/usr/lib/gcc/powerpc-unknown-linux-gnu/4.4.3/../../../../powerpc-unknown-linux-gnu/include
#include "..." search starts here:
#include <...> search starts here:
 /usr/lib/gcc/powerpc-unknown-linux-gnu/4.4.3/include
 /usr/lib/gcc/powerpc-unknown-linux-gnu/4.4.3/include-fixed
 /usr/include
End of search list.
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-mcpu=G3' '-Wall' '-O3' '-ffast-math'
'-c'
 /usr/libexec/gcc/powerpc-unknown-linux-gnu/4.4.3/cc1 -fpreprocessed
isolated-src-a8-u.i -msecure-plt -quiet -dumpbase isolated-src-a8-u.c -mcpu=G3
-auxbase isolated-src-a8-u -O3 -Wall -version -ffast-math -o
isolated-src-a8-u.s
GNU C (Gentoo 4.4.3-r2 p1.2) version 4.4.3 (powerpc-unknown-linux-gnu)
compiled by GNU C version 4.4.3, GMP version 4.3.2, MPFR version
2.4.2-p3.
GGC heuristics: --param ggc-min-expand=64 --param ggc-min-heapsize=63683
Compiler executable checksum: a8f353e88d0b3fd0803d2b037b563de0
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-mcpu=G3' '-Wall' '-O3' '-ffast-math'
'-c'

/usr/lib/gcc/powerpc-unknown-linux-gnu/4.4.3/../../../../powerpc-unknown-linux-gnu/bin/as
-mppc -many -V -Qy -o isolated-src-a8-u.o isolated-src-a8-u.s
GNU assembler version 2.20.1 (powerpc-unknown-linux-gnu) using BFD version (GNU
Binutils) 2.20.1.20100303
COMPILER_PATH=/usr/libexec/gcc/powerpc-unknown-linux-gnu/4.4.3/:/usr/libexec/gcc/powerpc-unknown-linux-gnu/4.4.3/:/usr/libexec/gcc/powerpc-unknown-linux-gnu/:/usr/lib/gcc/powerpc-unknown-linux-gnu/4.4.3/:/usr/lib/gcc/powerpc-unknown-linux-gnu/:/usr/libexec/gcc/powerpc-unknown-linux-gnu/4.4.3/:/usr/libexec/gcc/powerpc-unknown-linux-gnu/:/usr/lib/gcc/powerpc-unknown-linux-gnu/4.4.3/:/usr/lib/gcc/powerpc-unknown-linux-gnu/:/usr/lib/gcc/powerpc-unknown-linux-gnu/4.4.3/../../../../powerpc-unknown-linux-gnu/bin/
LIBRARY_PATH=/usr/lib/gcc/powerpc-unknown-linux-gnu/4.4.3/:/usr/lib/gcc/powerpc-unknown-linux-gnu/4.4.3/:/usr/lib/gcc/powerpc-unknown-linux-gnu/4.4.3/../../../../powerpc-unknown-linux-gnu/lib/:/usr/lib/gcc/powerpc-unknown-linux-gnu/4.4.3/../../../:/lib/:/usr/lib/
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-mcpu=G3' '-Wall' '-O3' '-ffast-math'
'-c'

