http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51233
Bug #: 51233
Summary: [ipa-iterations] running multiple passes of early IPA on zlib produces more optimal code
Classification: Unclassified
Product: gcc
Version: 4.7.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: m...@use.net

Using current trunk with Maxim's eipa-iterations patch, I modified the zlib 1.2.3.4 makefile (from Ubuntu 11.10's source package) as follows to build on my Ubuntu 11.10/amd64 system:

CC=gcc
CFLAGS=--param eipa-iterations=3 -flto -Ofast
SFLAGS=$(CFLAGS) -shared -fPIC
LDFLAGS=-flto -L. libz.a

I then built and tested the resulting minigzip utility at both the macro level (total runtime) and the micro level (using callgrind's cache-miss and branch-misprediction simulation).

At the macro level, a run on a single 50MB file on a ramdisk in single-user mode shows minor improvements that may qualify as noise.

At the micro level, callgrind shows 0.4% fewer branch mispredictions and a dramatic decrease in data accesses (but a slight uptick in data cache misses).

While there are some notable code differences between 2 and 3 iterations, they don't appear to have an effect on performance at either the macro or the micro level. Given the relative simplicity of the code in the library, these additional optimizations could possibly have been obtained within a single iteration.
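
For anyone reproducing the micro-level comparison, a figure such as "0.4% fewer branch mispredictions" is just the relative change between two counter totals, for example the misprediction counts callgrind reports when run with --branch-sim=yes. The following is a minimal sketch of that arithmetic only; the function name and the two totals are hypothetical placeholders, not measurements from this report.

```python
def percent_change(baseline: int, candidate: int) -> float:
    """Return the signed percentage change from baseline to candidate."""
    if baseline == 0:
        raise ValueError("baseline count must be non-zero")
    return (candidate - baseline) / baseline * 100.0


# Placeholder totals for illustration only, NOT data from this bug report.
baseline_mispredicts = 250_000   # hypothetical: build without extra iterations
candidate_mispredicts = 249_000  # hypothetical: --param eipa-iterations=3

change = percent_change(baseline_mispredicts, candidate_mispredicts)
print(f"branch mispredictions: {change:+.1f}%")
```

A negative result means the candidate build recorded fewer events than the baseline. The same helper applies to the data-access and data-cache-miss totals.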