http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54776



             Bug #: 54776

           Summary: [4.8 Regression] tramp3d-v4: 20% performance regression using -O3

    Classification: Unclassified

           Product: gcc

           Version: 4.8.0

            Status: UNCONFIRMED

          Severity: normal

          Priority: P3

         Component: tree-optimization

        AssignedTo: unassig...@gcc.gnu.org

        ReportedBy: mar...@trippelsdorf.de





With gcc-4.8 (--enable-checking=release):



markus@x4 ~ % time c++ -w -O3 tramp3d-v4.cpp

c++ -w -O3 tramp3d-v4.cpp  24.87s user 0.34s system 99% cpu 25.293 total

markus@x4 ~ % ./a.out --cartvis 1.0 0.0 --rhomin 1e-8 -n 20

...

Time spent in iteration: 7.35642



With gcc-4.7.2:



markus@x4 ~ % time c++ -w -O3 tramp3d-v4.cpp

c++ -w -O3 tramp3d-v4.cpp  25.15s user 0.33s system 99% cpu 25.568 total

markus@x4 ~ % ./a.out --cartvis 1.0 0.0 --rhomin 1e-8 -n 20

...

Time spent in iteration: 5.81199



LTO doesn't help much (gcc-4.8):



markus@x4 ~ % time c++ -w -O3 -flto  tramp3d-v4.cpp

c++ -w -O3 -flto tramp3d-v4.cpp  45.78s user 0.95s system 99% cpu 47.012 total

markus@x4 ~ % ./a.out --cartvis 1.0 0.0 --rhomin 1e-8 -n 20

...

Time spent in iteration: 7.2111





(For comparison, here are some clang results:



markus@x4 ~ % time clang++ -w -O3 tramp3d-v4.cpp

clang++ -w -O3 tramp3d-v4.cpp  14.67s user 0.12s system 99% cpu 14.874 total

markus@x4 ~ % ./a.out --cartvis 1.0 0.0 --rhomin 1e-8 -n 20

...

Time spent in iteration: 6.1923



markus@x4 ~ % time clang++ -w -O3 -flto tramp3d-v4.cpp

clang++ -w -O3 -flto tramp3d-v4.cpp  20.28s user 0.16s system 99% cpu 20.535 total

markus@x4 ~ % ./a.out --cartvis 1.0 0.0 --rhomin 1e-8 -n 20

...

Time spent in iteration: 4.47936



That's almost a 28% improvement due to -flto.)
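
To make the percentages easy to re-check, here is a small sketch that recomputes them from the "Time spent in iteration" values quoted above. The dictionary labels and the helper name percent_change are illustrative only and are not part of tramp3d or of either compiler. Measured this way, gcc-4.8 takes roughly 27% longer per run than gcc-4.7.2 (equivalently, about 21% lower throughput), -flto changes the gcc-4.8 time by only about 2%, and clang gains close to 28% from -flto.

```python
# Iteration times in seconds, copied from the runs quoted in this report.
# The labels are illustrative names chosen for this sketch.
times = {
    "gcc-4.7.2 -O3": 5.81199,
    "gcc-4.8 -O3": 7.35642,
    "gcc-4.8 -O3 -flto": 7.2111,
    "clang -O3": 6.1923,
    "clang -O3 -flto": 4.47936,
}


def percent_change(baseline, measured):
    """Relative change in run time versus baseline, in percent.

    Positive means the measured run is slower than the baseline.
    """
    return (measured - baseline) / baseline * 100.0


# Slowdown of gcc-4.8 relative to gcc-4.7.2 at -O3.
gcc_regression = percent_change(times["gcc-4.7.2 -O3"], times["gcc-4.8 -O3"])

# The same comparison expressed as lost throughput (work per second).
gcc_throughput_loss = (1.0 - times["gcc-4.7.2 -O3"] / times["gcc-4.8 -O3"]) * 100.0

# Effect of -flto within each compiler.
gcc_lto = percent_change(times["gcc-4.8 -O3"], times["gcc-4.8 -O3 -flto"])
clang_lto = percent_change(times["clang -O3"], times["clang -O3 -flto"])

print(f"gcc-4.8 vs gcc-4.7.2 run time: {gcc_regression:+.1f}%")
print(f"gcc-4.8 vs gcc-4.7.2 throughput: -{gcc_throughput_loss:.1f}%")
print(f"gcc-4.8 -flto vs no -flto: {gcc_lto:+.1f}%")
print(f"clang -flto vs no -flto: {clang_lto:+.1f}%")
```

Each figure comes from a single run, so the numbers only indicate the rough size of the effect.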
