https://fedoraproject.org/wiki/LTOByDefault

== Summary ==
This is a proposal to enable link time optimization (LTO) of packages
built with rpmbuild by default.  It's an over-simplification, but
think of LTO as deferring analysis, optimization and code generation
until creation of an executable or dynamic shared object.

This is implemented by adding the option "-flto" to the injected flags
in redhat-rpm-config.  There will be a simple way for packages to opt
out of LTO.

== Owner ==
* Name: Jeff Law
* Email: l...@redhat.com

== Detailed Description ==
Programs which are built with rpmbuild and which honor flags injection
via redhat-rpm-config will be built with LTO by default.  A simple
opt-out mechanism will be provided for packages which use features
that are not LTO compatible.

The LTO bytecode itself will not be distributed as it is not stable
from one GCC release to the next.  This is enforced by stripping the
LTO bytecode from any installed .o/.a files.  We'll use bits SuSE has
already written for redhat-rpm-config to implement this.

Minor changes are desirable to the %configure macro in
redhat-rpm-config to fix common code idioms used by autoconf generated
scripts which are compromised by the additional optimization enabled
by LTO.  Minor updates to various packages will be needed to opt out
of LTO or fix bugs exposed by LTO.

== Benefit to Fedora ==

The primary benefits of building with LTO enabled are smaller, faster
executables/DSOs.  A secondary benefit is that LTO allows deeper analysis
of package source code at compile time which can improve various GCC
diagnostics and thus improve our ability to catch bugs at compile time
such as uninitialized objects, buffer overflows, unterminated strings,
restrict violations, etc.

This change also brings us back on par with SuSE, who enabled LTO by
default for their free distribution earlier in 2019.


== Scope ==
* Proposal owners:
The primary change is to redhat-rpm-config to add LTO to the default
compile/link flags as well as a conditional which allows easy opt-out
on a package-by-package basis.  Additionally, the post-build scripts
need to strip the LTO bytecode from any installed .o/.a files.

Additionally, we know there are many packages with configure scripts
that are compromised by LTO.  I have tweaks to the %configure macro in
redhat-rpm-config which fix the vast majority of these problems with
a few simple sed scripts on the generated output.  Like the basic
support for injecting the LTO flags, this will require coordination
with the redhat-rpm-config maintainers.  Packages which call configure
directly and have compromised tests will need a one-line change to
their .spec files to fix their configure scripts.

Some packages will need to opt out of using LTO at this time.  The
most common cases are packages that use symbol versioning or top-level
asm statements.  While there is a new mechanism to make LTO work with
symbol versioning, I don't think any packages have been updated to use
that mechanism.  This will require a one-line change to 50-75 packages
(my script to find these is still running).

Finally, some packages will fail to build with LTO due to deeper
analysis for compile-time diagnostics catching programming mistakes
that have gone unnoticed until now.  I'll obviously be working with
package maintainers on all of these issues.

Note that even though the changes are fairly well localized in
redhat-rpm-config and a small number of packages, the real scope of
this change is much larger since it affects all packages in the
distribution that are compiled with GCC and which honor the flags
injection by redhat-rpm-config.


* Other developers:
As I mentioned, I'm happy to contact package owners that need to
modify their packages and suggest how their package needs to be fixed.
As a multi-decade GCC developer, I'm particularly well suited to
describe LTO, its limitations and how LTO impacts the diagnostics from
GCC to any package owner that needs additional information.

I'm also capable and available to address any GCC issues that may
arise as a result of this change.  I don't expect much of the latter
as SuSE has already enabled this feature for their distribution and
thus weeded out most of the issues.

The highest level of coordination will be with the redhat-rpm-config
maintainers.

I will also be coordinating with the GDB team to address debugging
issues related to LTO.  The most important issue is to ensure that we
can pass the GDB testsuite with and without the -flto option being
enabled.  Failure to meet this goal would be considered a blocking
issue for LTO enablement.

I'm also already in contact with SuSE and Debian/Ubuntu engineers to
discuss issues with gcc-10 with and without LTO.

We know there are some problems with debugging LTO code.  I will be
working with the GDB team to identify these issues and fix them either
in the debugger or compiler as needed.

I have prototype code for the required redhat-rpm-config changes and
I'll coordinate with the redhat-rpm-config maintainer to get them into
the desired final form.

I also know every package that fails with LTO enabled.  I'm still
categorizing those failures.  Many will ultimately need to use the
opt-out mechanism because they use features that are not compatible
with LTO.  I expect to have all this ready to go the first work week
of the new year.  I will coordinate with package owners to either add
the opt-out markers or fix issues in the package as needed.


* Release engineering: (a check of an impact with Release Engineering is needed)
Aside from the redhat-rpm-config changes, I do not expect any work
from releng to be necessary.  However, they need to be aware of the
change and who to contact in case of issues.

* Policies and guidelines: It would be useful to document how to
opt out of LTO in the packaging guidelines.

* Trademark approval: N/A (not needed for this Change)


== Upgrade/compatibility impact ==
Should not affect compatibility.  Stripping of the LTO bytecode is
critical to ensure there are no long-term compatibility issues.


== How To Test ==
In the short term, I'm happy to expose a repository with a gcc-10
snapshot and updated redhat-rpm-config.  Developers could then use
that repo to pick up gcc-10 and LTO optimizations for testing
purposes.  I'm already doing this internally for x86_64 and exposing
it to the world would be trivial.

Given such a repository, another developer would merely use that repo
when building their package.  No special hardware is needed.  The most
useful testing is first to identify FTBFS issues and get them
proactively fixed.  I'm happy to own that since I'm already doing that
for baseline gcc-10 issues as well as gcc-10 + LTO issues.

Doing the same testing on other architectures would definitely be
useful.  I'd be particularly concerned about large packages on the
32-bit architectures.  I wouldn't be surprised if we find some packages
need to opt out of LTO because they run out of memory at link/compile
time.  I'm already in contact with some Debian maintainers who want
to do testing around this issue as they're investigating a similar
change for Debian.

I'm already building all of Fedora with the weekly gcc-10 snapshots
(including LTO builds starting the week of 12/15).  This is primarily
to proactively find and address issues with the gcc-10 transition, but
verification of LTO state pretty much piggybacks for free on the
gcc-10 work.

== User Experience ==
In theory, the only noticeable difference to users would be smaller,
faster binaries and DSOs.  However, a developer that uses rpmbuild to
build their own code may see their package fail to build if it's got
errors or uses certain features that do not work with LTO.

Users who try to debug Fedora shipped executables could notice
differences in the debugging experience.

== Dependencies ==
None expected beyond addressing FTBFS issues and coordination between
GCC and GDB teams on any debugging issues we find over the next few
weeks.

== Contingency Plan ==
* Contingency mechanism: Revert the LTO flags injection
* Contingency deadline: Beta freeze, but shooting for before the mass
rebuild starts
* Blocks release? No
* Blocks product? No

Most critically, if we don't address the GDB testsuite issue noted
above, our fallback position would be to simply disable the LTO
injection globally and re-evaluate for Fedora 33.  The same applies if
we find some show-stopping LTO issue.

Otherwise the plan is to analyze the remaining 100-125 package build
failures.  These are likely a mixture of configure issues that can't
be trivially fixed via %configure, new diagnostics exposed by the
deeper analysis from LTO, and other small issues.

== Documentation ==
We will want documentation on the opt-out method for RPM builds.

-- 
Ben Cotton
He / Him / His
Fedora Program Manager
Red Hat
TZ=America/Indiana/Indianapolis