http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51233

             Bug #: 51233
           Summary: [ipa-iterations] running multiple passes of early IPA
                    on zlib produces more optimal code
    Classification: Unclassified
           Product: gcc
           Version: 4.7.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: m...@use.net


Using current trunk, with Maxim's eipa-iterations patch. I modified the zlib
1.2.3.4 makefile (from Ubuntu 11.10's source package) as such for building on
my Ubuntu 11.10/amd64 system:

CC=gcc
CFLAGS=--param eipa-iterations=3 -flto -Ofast
SFLAGS=$(CFLAGS) -shared -fPIC
LDFLAGS=-flto -L. libz.a

And then built and tested the resulting minigzip utility both at the
macro-level (total runtime), and the micro-level (using callgrind's cache miss
and branch misprediction benchmarks). Macro level, when run a single 50MB file
on a ramdisk in single user mode shows minor improvements that may qualify as
noise. At the micro level, callgrind shows 0.4% fewer branch mispredictions,
and a dramatic decrease in data accesses (but a slight uptick in data cache
misses).

While there are some notable code differences between 2 and 3 iterations, they
don't appear to have an effect on the performance at the macro- or micro-level.

Given the relative simplicity of the code in the library, these additional
optimizations could possibly have been gotten within a single iteration.

Reply via email to