------- Comment #4 from nospamname at web dot de 2009-06-16 11:06 -------
I have received reports with more information about -funswitch-loops.
On GCC 4.3.0 and above, the -funswitch-loops option does not seem to be good for speed. It generates much larger code (wma123), and the code is slower in many cases (try ffmpeg H264 decoding). That report comes from an Athlon 2600+ with single-channel RAM running an Amiga emulator. On my own system, an AMD64 3000+ with dual-channel RAM running an Amiga emulator, -funswitch-loops only causes larger files but no slowdown. I guess that on a real 68k CPU without a second-level cache, -funswitch-loops is even less optimal.

GCC 3.4.0 also has this option enabled at -O3, or am I wrong? There the speed is better. Is there a way to tweak some values in the backend for a specific CPU so that -funswitch-loops works as it does in 3.4.0 (maybe by not unrolling so many loops)?

For now, as far as the speed reports I have received go, the best solution for speed is to leave -funswitch-loops disabled. With it disabled, H264 decoding on the system with single-channel RAM is the same as or a little faster than the 3.4.0 build.

Here are some numbers that also show a slowdown with compilers 4.2.4 and 4.3.0, but on x86:
http://multimedia.cx/eggs/compiler-performance-profiling-with-ffmpeg/

GCC 4.4.0 allows changing the compiler switches within the source. Maybe somebody can tell me whether that feature can give more precise information about which line of code causes the problem. The further development of Blitz Basic (called Amiblitz), for example, can set the optimization for every source line. This helps a lot in finding the source line that causes a problem in the peephole optimizer. Of course, Amiblitz does not have a peephole optimizer that looks as far ahead as GCC's. But maybe this GCC 4.4.0 feature can help in some way to locate the problem more precisely? I can then test and report more.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40414
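Regarding the GCC 4.4 feature asked about above, here is a minimal sketch of how it could be used to narrow down which function suffers from -funswitch-loops. As far as I know, the control is per function (the optimize attribute, or #pragma GCC optimize between push_options and pop_options), not per source line. The function names and the loop are hypothetical and are not taken from ffmpeg or the emulator. The loop is only an example of the kind of loop-invariant test that unswitching duplicates.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hot loop. The test on `invert` does not change inside the
   loop, so at -O3 -funswitch-loops may emit two copies of the loop, one
   per branch. This is faster per iteration but produces larger code. */
void scale_default(int *dst, const int *src, size_t n, int invert)
{
    size_t i;
    assert(dst != NULL && src != NULL);
    for (i = 0; i < n; i++) {
        if (invert)
            dst[i] = -src[i];
        else
            dst[i] = src[i];
    }
}

/* Same loop, but unswitching is disabled for this one function only.
   Needs GCC 4.4 or later. The string is treated like -fno-unswitch-loops. */
__attribute__((optimize("no-unswitch-loops")))
void scale_no_unswitch(int *dst, const int *src, size_t n, int invert)
{
    size_t i;
    assert(dst != NULL && src != NULL);
    for (i = 0; i < n; i++) {
        if (invert)
            dst[i] = -src[i];
        else
            dst[i] = src[i];
    }
}
```

Both functions compute the same result; only the generated code should differ. Building the whole file with -O3 and moving the attribute from one suspect function to the next would be one way to bisect a slowdown. The GCC manual describes the optimize attribute as a debugging aid, which is how it is used here.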