Hi, I've been experimenting with the PGO build infrastructure and have a few changes I thought I'd share to get feedback on whether they're completely crazy or not. I'm not terribly familiar with the innards of the build infra, so I would appreciate any comments and suggestions.
First, a recap of the current PGO build process -- please let me know if I'm wrong about anything:

For a profiledbootstrap build, we replace the normal stage{2,3,4} respectively with an instrumented stageprofile, a non-instrumented stagetrain and finally a stagefeedback that uses the profile created when building stagetrain.

I had two main goals in making these changes:

1. profiledbootstrap does not do any comparison build, unlike the regular bootstrap, so it is possible that the end product is actually broken. Goal 1: try to incorporate a comparison.

2. The profiling data comes from the stageprofile -> stagetrain build, and that run does not include many optimization passes (at least by default at -O2) because those only get enabled when profiling data is available. Goal 2: try to create a bootstrap target that would incorporate data from these passes.

Goal 1: Comparison stage

I started on goal 1 with the idea that if we built stagetrain with instrumentation as well, we could just compare stageprofile and stagetrain like we do with stage2/3. This runs into a few roadblocks, however, that I would appreciate comments on:

a) Profiling data starts getting generated while building stageprofile, since parts of that process involve running newly compiled executables. In the current build that doesn't cause any issues: we just throw that data away and only use the profiling data generated during the profile->train build, which should be extensive enough. This will not work if stagetrain is built with instrumentation, as it will be appending profiling data to the same files that it is reading during the train->feedback build. To resolve this, I changed the build to save profiling data in an external directory, so the two stages write profiling data into different places.
Unfortunately, this results in the path to that location getting saved into each object file, which makes it impossible to compare them. It should be possible to compare just the .text sections, maybe, or to pass different GCOV_PREFIX overrides for build vs host tools. Instead, I decided to add the possibility of rebuilding stagefeedback a second time using the profile->train data and using that as the comparison; this should be the right comparison to do anyway, as it compares the final build product. It may also be possible to solve this by saving the profile data in the same place for the two stages but making a copy of it to use for the train->feedback run; I haven't explored this yet. It would result in profiling data that is a mix of the stageprofile and stagetrain compilers, but that might be okay given that they should be identical in control flow.

b) I do get a few differences that are somewhat random: it looks like in some cases the second run arranges functions in a different order from the first run even though it is using the same profile data. Is this known, and is there a way to prevent it?

Goal 2: Second feedback stage

Nothing special here: it builds a new stagefeedbackfull using the train->feedback profile. It does produce a different compiler, so there is some effect, but I haven't benchmarked it to see whether it is measurably better.

Testing was done on x86_64-pc-linux-gnu, with default configure settings except for --enable-languages=c,c++ --disable-werror. I've bootstrapped PGO with and without --with-build-config=bootstrap-lto.

Summary of changes:

a) Add three new stages -- feedbackcompare, feedbackfull, feedbackfullcompare -- with the two *compare stages used for comparing with the previous ones.

Question about gcc/*/Make-lang.in: I see that these have rules at the end for, e.g., c.stage*. Are these necessary or vestigial? stagetrain is not there currently, and I didn't add any of the new ones either.

b) Modify stagetrain to be built instrumented, and change the profiling output directories. Note that this is currently wasteful of build time if you're going to stop at profiledbootstrap, so perhaps this should be controlled via a build-config so that it is enabled only for the *full bootstraps.

c) Cleaned up bootstrap-lto{-lean}.mk a bit. It appears unnecessary to set all the individual stage flags: anyone who wants to customize them can just override STAGE{2,3,4}_FLAGS to get the same effect. I also added STAGE4_CFLAGS in there, and added -frandom-seed=1 and do-compare3 in bootstrap-lto-lean in case the user wants to do a bootstrap4. For bootstrap-lto-noplugin.mk I noticed that the profiling stages were added but without -ffat-lto-objects; the patch fixes that, although it appears unlikely that anyone would be doing such a build.

d) If one does a non-LTO PGO build currently, the LTO frontend doesn't get profiled. I modified the main Makefile to add the LTO flag during the generator build, similar to bootstrap-lto-lean.mk. For a c/c++ bootstrap, the remaining unprofiled files that are warned about are mostly in libiberty.

The patch is attached; the top-level configure and Makefile.in need to be regenerated.

Thank you.

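P.S. To make the unexplored "copy the profile data" alternative from Goal 1 a bit more concrete, here is a minimal sketch. It is not part of the attached patch: the helper name snapshot_profile and the directory names in the usage line are hypothetical, and it assumes the profile data consists of *.gcda files under a single directory.

```shell
#!/bin/sh
# Hypothetical helper, NOT part of the attached patch.
# snapshot_profile SRC DST: copy every *.gcda file under SRC into DST,
# keeping the relative directory layout, so that later instrumented runs
# that keep updating SRC cannot change what is read from DST.
snapshot_profile() {
  src=$1
  dst=$2
  mkdir -p "$dst" || return 1
  # Resolve DST to an absolute path before changing directory.
  dst=$(cd "$dst" && pwd) || return 1
  (
    cd "$src" || exit 1
    find . -type f -name '*.gcda' | while IFS= read -r f; do
      # Recreate the subdirectory, then copy preserving timestamps.
      mkdir -p "$dst/$(dirname "$f")" && cp -p "$f" "$dst/$f" || exit 1
    done
  )
}

# Hypothetical usage (directory names are assumptions):
#   snapshot_profile "$r/profile-shared" "$r/profile-shared.snapshot"
```

The train->feedback build would then read the snapshot directory while the instrumented compiler keeps writing to the shared one.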
diff --git a/Makefile.def b/Makefile.def
index 1aab271d8aa..d4312c9de52 100644
--- a/Makefile.def
+++ b/Makefile.def
@@ -628,12 +628,23 @@ bootstrap_stage = {
 	compare_target=compare3 ;
 	bootstrap_target=bootstrap4 ; };
 bootstrap_stage = {
-	id=profile ; prev=1 ; };
+	id=profile ; prev=1 ; profilegen=profile ; };
 bootstrap_stage = {
-	id=train; prev=profile ; } ;
+	id=train; prev=profile ; lean=1 ; profilegen=train ; } ;
 bootstrap_stage = {
-	id=feedback ; prev=train;
+	id=feedback ; prev=train; lean=profile ; profileuse=profile ;
 	bootstrap_target=profiledbootstrap ; };
+bootstrap_stage = {
+	id=feedbackcompare ; prev=feedback; lean=train ; profileuse=profile ;
+	compare_target=comparefeedback ;
+	bootstrap_target=profiledbootstrapcompare ; };
+bootstrap_stage = {
+	id=feedbackfull ; prev=feedback; lean=train ; profileuse=train ;
+	bootstrap_target=profiledbootstrapfull ; };
+bootstrap_stage = {
+	id=feedbackfullcompare ; prev=feedbackfull; lean=feedback ; profileuse=train ;
+	compare_target=comparefeedbackfull ;
+	bootstrap_target=profiledbootstrapfullcompare ; };
 bootstrap_stage = {
 	id=autoprofile ; prev=1 ;
 	autoprofile="$$s/gcc/config/i386/$(AUTO_PROFILE)" ; };
diff --git a/Makefile.tpl b/Makefile.tpl
index 1cdc023c82f..229164da8b0 100644
--- a/Makefile.tpl
+++ b/Makefile.tpl
@@ -481,14 +481,23 @@ STAGE2_TFLAGS += -fno-checking
 STAGE3_CFLAGS += -fchecking=1
 STAGE3_TFLAGS += -fchecking=1
 
-STAGEprofile_CFLAGS = $(STAGE2_CFLAGS) -fprofile-generate
+STAGEprofile_CFLAGS = $(STAGE2_CFLAGS) -fprofile-exclude-files=conftest -fprofile-generate=$$r/$(HOST_SUBDIR)/profile-stageprofile
 STAGEprofile_TFLAGS = $(STAGE2_TFLAGS)
 
-STAGEtrain_CFLAGS = $(filter-out -fchecking=1,$(STAGE3_CFLAGS))
-STAGEtrain_TFLAGS = $(filter-out -fchecking=1,$(STAGE3_TFLAGS))
+STAGEtrain_CFLAGS = $(filter-out -fchecking=1,$(STAGE3_CFLAGS)) -fprofile-exclude-files=conftest -fprofile-generate=$$r/$(HOST_SUBDIR)/profile-stagetrain
+STAGEtrain_TFLAGS = $(filter-out -fchecking=1,$(STAGE3_TFLAGS))
 
-STAGEfeedback_CFLAGS = $(STAGE4_CFLAGS) -fprofile-use
-STAGEfeedback_TFLAGS = $(STAGE4_TFLAGS)
+[+ FOR bootstrap-stage +][+ IF profileuse +]
+STAGE[+id+]_CFLAGS = $(STAGE4_CFLAGS) -fprofile-use=$$r/$(HOST_SUBDIR)/profile-stage[+profileuse+]
+STAGE[+id+]_TFLAGS = $(STAGE4_TFLAGS)
+[+ ENDIF profileuse +][+ ENDFOR bootstrap-stage +]
+
+# If we are building lto, but not using it during the build it will never get profiled.
+# Force add lto flags to the generators (like in config/bootstrap-lto-lean.mk)
+ifneq (,$(filter lto,@languages@))
+STAGEtrain_GENERATOR_CFLAGS += -flto=jobserver
+STAGEfeedback_GENERATOR_CFLAGS += -flto=jobserver
+endif
 
 STAGEautoprofile_CFLAGS = $(STAGE2_CFLAGS) -g
 STAGEautoprofile_TFLAGS = $(STAGE2_TFLAGS)
@@ -498,6 +507,9 @@ STAGEautofeedback_TFLAGS = $(STAGE3_TFLAGS)
 
 do-compare = @do_compare@
 do-compare3 = $(do-compare)
+[+ FOR bootstrap-stage +][+ IF profileuse +][+ IF compare_target +]
+do-[+compare_target+] = $(do-compare3)
+[+ ENDIF compare_target +][+ ENDIF profileuse +][+ ENDFOR bootstrap-stage +]
 
 # -----------------------------------------------
 # Programs producing files for the TARGET machine
@@ -1657,6 +1669,13 @@ stage[+id+]-bubble:: [+ IF prev +]stage[+prev+]-bubble[+ ENDIF +]
 	fi[+ IF compare-target +]
 	$(MAKE) $(RECURSE_FLAGS_TO_PASS) [+compare-target+][+ ENDIF compare-target +]
 
+[+ IF profilegen +]
+.PHONY: clean-stage[+id+]-profile
+clean-stage[+id+]: clean-stage[+id+]-profile
+clean-stage[+id+]-profile:
+	rm -rf $(HOST_SUBDIR)/profile-stage[+id+]
+[+ ENDIF profilegen +]
+
 .PHONY: all-stage[+id+] clean-stage[+id+]
 do-clean: clean-stage[+id+]
 
@@ -1736,7 +1755,8 @@ distclean-stage[+id+]::
 	@: $(MAKE); $(stage)
 	@test "`cat stage_last`" != stage[+id+] || rm -f stage_last
 	rm -rf stage[+id+]-* [+
-	  IF compare-target +][+compare-target+] [+ ENDIF compare-target +]
+	  IF compare-target +][+compare-target+] [+ ENDIF compare-target +][+
+	  IF profilegen +]$(HOST_SUBDIR)/profile-stage[+id+] [+ ENDIF profilegen +]
 
 [+ IF cleanstrap-target +]
 .PHONY: [+cleanstrap-target+]
@@ -1755,19 +1775,6 @@ distclean-stage[+id+]::
 
 [+ ENDFOR bootstrap-stage +]
 
-stageprofile-end::
-	$(MAKE) distclean-stagefeedback
-
-stagefeedback-start::
-	@r=`${PWD_COMMAND}`; export r; \
-	s=`cd $(srcdir); ${PWD_COMMAND}`; export s; \
-	for i in prev-*; do \
-	  j=`echo $$i | sed s/^prev-//`; \
-	  cd $$r/$$i && \
-	  { find . -type d | sort | sed 's,.*,$(SHELL) '"$$s"'/mkinstalldirs "../'$$j'/&",' | $(SHELL); } && \
-	  { find . -name '*.*da' | sed 's,.*,$(LN) -f "&" "../'$$j'/&",' | $(SHELL); }; \
-	done
-
 @if gcc-bootstrap
 do-distclean: distclean-stage1
 
diff --git a/config/bootstrap-lto-lean.mk b/config/bootstrap-lto-lean.mk
index 79cea50a4c6..e751779f992 100644
--- a/config/bootstrap-lto-lean.mk
+++ b/config/bootstrap-lto-lean.mk
@@ -1,10 +1,10 @@
 # This option enables LTO for stage4 and LTO for generators in stage3 with profiledbootstrap.
 # Otherwise, LTO is used in only stage3.
 
-STAGE3_CFLAGS += -flto=jobserver
+STAGE3_CFLAGS += -flto=jobserver -frandom-seed=1
+STAGE4_CFLAGS += -flto=jobserver -frandom-seed=1
 override STAGEtrain_CFLAGS := $(filter-out -flto=jobserver,$(STAGEtrain_CFLAGS))
 STAGEtrain_GENERATOR_CFLAGS += -flto=jobserver
-STAGEfeedback_CFLAGS += -flto=jobserver
 
 # assumes the host supports the linker plugin
 LTO_AR = $$r/$(HOST_SUBDIR)/prev-gcc/gcc-ar$(exeext) -B$$r/$(HOST_SUBDIR)/prev-gcc/
@@ -15,3 +15,5 @@ LTO_EXPORTS = AR="$(LTO_AR)"; export AR; \
 LTO_FLAGS_TO_PASS = AR="$(LTO_AR)" RANLIB="$(LTO_RANLIB)"
 
 do-compare = /bin/true
+do-compare3 = $(SHELL) $(srcdir)/contrib/compare-lto $$f1 $$f2
+extra-compare = gcc/lto1$(exeext)
diff --git a/config/bootstrap-lto-noplugin.mk b/config/bootstrap-lto-noplugin.mk
index 0f50708e49d..613e0dc09d1 100644
--- a/config/bootstrap-lto-noplugin.mk
+++ b/config/bootstrap-lto-noplugin.mk
@@ -3,7 +3,5 @@
 STAGE2_CFLAGS += -flto=jobserver -frandom-seed=1 -ffat-lto-objects
 STAGE3_CFLAGS += -flto=jobserver -frandom-seed=1 -ffat-lto-objects
-STAGEprofile_CFLAGS += -flto=jobserver -frandom-seed=1
-STAGEtrain_CFLAGS += -flto=jobserver -frandom-seed=1
-STAGEfeedback_CFLAGS += -flto=jobserver -frandom-seed=1
+STAGE4_CFLAGS += -flto=jobserver -frandom-seed=1 -ffat-lto-objects
 
 do-compare = /bin/true
diff --git a/config/bootstrap-lto.mk b/config/bootstrap-lto.mk
index 4de07e5b226..2f410e1b39a 100644
--- a/config/bootstrap-lto.mk
+++ b/config/bootstrap-lto.mk
@@ -1,10 +1,8 @@
-# This option enables LTO for stage2 and stage3 in slim mode
+# This option enables LTO for stage2 onward in slim mode
 
 STAGE2_CFLAGS += -flto=jobserver -frandom-seed=1
 STAGE3_CFLAGS += -flto=jobserver -frandom-seed=1
-STAGEprofile_CFLAGS += -flto=jobserver -frandom-seed=1
-STAGEtrain_CFLAGS += -flto=jobserver -frandom-seed=1
-STAGEfeedback_CFLAGS += -flto=jobserver -frandom-seed=1
+STAGE4_CFLAGS += -flto=jobserver -frandom-seed=1
 
 # assumes the host supports the linker plugin
 LTO_AR = $$r/$(HOST_SUBDIR)/prev-gcc/gcc-ar$(exeext) -B$$r/$(HOST_SUBDIR)/prev-gcc/
diff --git a/configure.ac b/configure.ac
index 9db4fd14aa2..ab31ca5f541 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2112,6 +2112,8 @@ Supported languages are: ${potential_languages}])
   fi
   AC_SUBST(stage1_languages)
 
+  languages=`echo "$enable_languages" | sed -e "s/,/ /g"`
+  AC_SUBST(languages)
   ac_configure_args=`echo " $ac_configure_args" | sed -e "s/ '--enable-languages=[[^ ]]*'//g" -e "s/$/ '--enable-languages="$enable_languages"'/" `
 fi