http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55309



--- Comment #23 from Kostya Serebryany <kcc at gcc dot gnu.org> 2013-02-07 05:01:53 UTC ---

With the patch from comment 22 (all benchmarks, ref data):

           benchmark,         orig,      patched, patched/orig
       400.perlbench,        -1.00,      1244.00,     -1244.00
           401.bzip2,      1189.00,      1137.00,         0.96
             403.gcc,       754.00,       750.00,         0.99
             429.mcf,       611.00,       610.00,         1.00
           445.gobmk,      1211.00,      1167.00,         0.96
           456.hmmer,      1834.00,      1501.00,         0.82
           458.sjeng,      1353.00,      1288.00,         0.95
      462.libquantum,       478.00,       480.00,         1.00
         464.h264ref,      1880.00,      1836.00,         0.98
         471.omnetpp,       621.00,       621.00,         1.00
           473.astar,       766.00,       763.00,         1.00
       483.xalancbmk,       515.00,       517.00,         1.00
            433.milc,       631.00,       625.00,         0.99
            444.namd,       538.00,       538.00,         1.00
          447.dealII,       716.00,       719.00,         1.00
          450.soplex,       421.00,       415.00,         0.99
          453.povray,       433.00,       429.00,         0.99
             470.lbm,       415.00,       411.00,         0.99
         482.sphinx3,      1377.00,      1343.00,         0.98



The average speedup is similar to what we saw with the equivalent optimization
in clang. Strangely, 400.perlbench fails with a warning when built with trunk
but passes with this patch. I have not investigated this further yet.



If we are looking for a greater speedup, we need to do more comprehensive
research. I have two wild guesses (not supported by any data).



#1 afaict, the asan pass runs in the middle of the gcc optimization flow.
imho it should run as late as possible so that the instrumentation is applied
to fully optimized code.

#2 asan speed is very sensitive to the quality of regalloc. It would be
interesting (and useful anyway) to implement zero-offset-shadow
(https://code.google.com/p/address-sanitizer/wiki/ZeroBasedShadow)
and see how much it helps with performance.
If it helps by more than clang's 5%, we have issues with regalloc; otherwise
see #1.
