Wiki - https://fedoraproject.org/wiki/Changes/ReproduciblePackageBuilds
Discussion.fpo - https://discussion.fedoraproject.org/t/f41-change-proposal-reproducible-package-builds-system-wide/112740
== Summary ==
A post-build cleanup is integrated into the RPM build process so that common causes of build irreproducibility are removed, making most Fedora packages reproducible.

== Owner ==
* Name: Davide Cavalca
* Name: Neil Hanlon
* Name: [[User:churchyard|Miro Hrončok]]
* Name: [[User:zbyszek|Zbigniew Jędrzejewski-Szmek]]
* Email: dcava...@fedoraproject.org
* Email: neil at shrug.pw
* Email: mhroncok at redhat.com
* Email: zbyszek at in.waw.pl

== Detailed Description ==
As of 2023 there is an active effort to implement [https://docs.fedoraproject.org/en-US/reproducible-builds/ Reproducible builds] in Fedora. Reproducible builds will allow our users to independently verify that the RPMs have not been tampered with (either maliciously or through a hardware or software fault): anyone can rebuild a package independently and confirm that they get identical binaries when building with the same versions of the compiler and other tools. This Change moves us in that direction by removing the common sources of irreproducibility.

[https://github.com/keszybz/add-determinism add-determinism] is a Rust program which, as its name suggests, adds determinism to the files it is given as input. It attempts to standardize metadata contained in binary or source files to ensure consistency, clamping to $SOURCE_DATE_EPOCH in all instances. `add-determinism` is the "Fedora version" of [https://salsa.debian.org/reproducible-builds/strip-nondeterminism strip-nondeterminism] from the Debian project. Since strip-nondeterminism is written in Perl, it is undesirable for use in Fedora, as we don't want to pull Perl into the buildroot for every package.

It's worth noting that this Change does not intend to impose any specific reproducibility requirements on Fedora packages.
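To make the idea of clamping embedded metadata concrete, here is a minimal Python sketch that uses ZIP member timestamps as the example. It is not how `add-determinism` is implemented (that is a Rust program with its own handlers). The helper names `make_zip` and `clamp_zip`, the choice of ZIP as the file format, the epoch value, and the decision to treat stored timestamps as UTC are all assumptions made purely for illustration.

```python
import io
import time
import zipfile


def make_zip(date_time):
    """Hypothetical helper: build a one-member archive with the given timestamp."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        info = zipfile.ZipInfo("hello.txt", date_time=date_time)
        zf.writestr(info, b"hello\n")
    return buf.getvalue()


def clamp_zip(data, source_date_epoch):
    """Rewrite an archive so that no member timestamp is newer than the epoch.

    Assumption: member timestamps are interpreted as UTC.
    """
    limit = time.gmtime(source_date_epoch)[:6]
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(data)) as zin, zipfile.ZipFile(out, "w") as zout:
        for info in zin.infolist():
            payload = zin.read(info)
            # Only timestamps newer than the limit are touched; older ones stay.
            if info.date_time > limit:
                info.date_time = limit
            zout.writestr(info, payload)
    return out.getvalue()


if __name__ == "__main__":
    epoch = 1700000000  # arbitrary example value, not taken from the proposal
    first = make_zip((2024, 5, 1, 10, 0, 0))
    second = make_zip((2024, 5, 2, 11, 30, 0))
    print("raw archives identical:", first == second)
    print("clamped archives identical:",
          clamp_zip(first, epoch) == clamp_zip(second, epoch))
```

Two archives that differ only in when they were created become byte-identical once their timestamps are clamped to the same value, which is the effect the cleanup step aims for across many file types.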
Once this Change is implemented and we have been through a mass rebuild and can verify that the common causes of irreproducibility have indeed been removed, we can consider further steps. But that will be at least one release later.

This Change does add a small amount of time to the processing of RPMs at the end of a build. Packages containing many files or very large files may therefore build more slowly, but this effect is not expected to be noticeable.

`add-determinism` takes steps to ensure it does not interfere with other buildroot post-processors like `mangle-shebangs`, `python-hardlink`, `python-bytecompile`. It defaults to making no modifications if it doesn't understand the input file or if any other problem occurs.

A mechanism to opt out will be provided: either to disable the postprocessing step completely or to disable specific "handlers" (i.e. implementations of cleanup for specific file types, for example static archives). See [https://github.com/keszybz/add-determinism/blob/main/rpm/macros.build-reproducibility macros.build-reproducibility].

=== Related Changes ===
* [https://fedoraproject.org/wiki/Changes/ReproducibleBuildsClampMtimes Clamp build mtimes to SOURCE_DATE_EPOCH]
* [https://fedoraproject.org/wiki/Changes/RPM-4.20 RPM 4.20]: this pulls in changes to `%autosetup -S git` which removed a source of irreproducibility.

== Feedback ==

== Benefit to Fedora ==
Adding determinism (i.e., removing non-determinism) enables the Fedora community to have confidence that, given the same source code, build environment, build instructions, and metadata from the build artifacts, any party can recreate copies of the artifacts that are identical except for the signatures and some parts of the metadata.

Reproducibility of builds leads to packages of higher quality. It turns out that quite often those irreproducible bits are caused by an error or sloppiness in the code.
In particular, any dependence on architecture in noarch packages is almost always unwanted and/or a bug. Test builds that check reproducibility will expose such instances.

Reproducibility of builds makes it easier to develop packages: when a small change is made and a package is rebuilt (in the same environment), then with a reproducible package, the only difference is directly caused by the change. If the package is different every time it is rebuilt, making a comparison is much harder.

Build reproducibility for noarch ''sub''packages solves the problem where package builds on different architectures are different, causing mock to reject the whole build. In particular, this issue occurs for [https://docs.fedoraproject.org/en-US/packaging-guidelines/Python_Appendix/#_byte_compilation_reproducibility pyc files]. This will now be solved without requiring opt-in from individual packages.

== Scope ==
* Proposal Owners:
** Integrate `add-determinism` as a BuildRoot Policy script
** Add a dependency on `marshalparser` to `python3` (probably conditionalized on `rpm-build`)
* Other Developers:
** Test their packages with the additional phase, report problems
** Potentially integrate changes to packages to enable reproducibility
* Release Engineering: Ideally we want this to happen before the mass rebuild, but that is not strictly required.
* Policies and Guidelines: Fedora Packaging Guidelines should be updated to include information on the add-determinism BuildRoot Policy. User documentation should be amended to include instructions on how to verify reproducibility for a given package, which packages are known to be non-reproducible, and how to opt out.
* Trademark approval: N/A (not needed for this Change)
* Alignment with Community Initiatives: All software and requests are consistent with the decision process and similar across other groups in Fedora. The Fedora Reproducibility Working group began at Flock 2023 in Cork.
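The comparison of two rebuilds mentioned above can be sketched in a few lines of Python. This is only an illustration and not an official verification procedure: it assumes the payloads of two builds have already been unpacked into two directories by some other means, and it compares file contents only (not modes, ownership, or RPM header metadata). The function names `digest_tree` and `diff_trees` are hypothetical.

```python
import hashlib
from pathlib import Path


def digest_tree(root):
    """Map each regular file under root (by relative path) to its SHA-256 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in root.rglob("*")
        if p.is_file()
    }


def diff_trees(first, second):
    """Return the sorted relative paths whose contents differ or that exist on one side only."""
    a, b = digest_tree(first), digest_tree(second)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))


if __name__ == "__main__":
    import sys

    # Usage (paths are placeholders): python compare.py build1/ build2/
    differing = diff_trees(sys.argv[1], sys.argv[2])
    for name in differing:
        print("differs:", name)
    sys.exit(1 if differing else 0)
```

An empty result means the two unpacked trees are content-identical. With a reproducible package, any path listed after a small change should be attributable to that change alone.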
== Upgrade/compatibility impact ==
No impact is expected.

== How To Test ==
To test on the level of individual files:
* install `add-determinism`
* call `SOURCE_DATE_EPOCH=… add-determinism -v ./path/to/file`

To test package builds:
* build a local copy of `redhat-rpm-config` with https://src.fedoraproject.org/rpms/redhat-rpm-config/pull-request/293
* install `add-determinism`
* build packages ;)

(This can be done on a normal system or in a mock chroot.)

== User Experience ==
No impact is expected.

== Dependencies ==

== Contingency Plan ==
* Contingency mechanism:
** In case of major problems, disable the change in `redhat-rpm-config`.
** In case of problems with specific packages, opt out by setting a macro.
* Contingency deadline: No limit really.
* Blocks release? No.

== Documentation ==
* [https://docs.fedoraproject.org/en-US/reproducible-builds/ Fedora Reproducible Builds]
* [https://github.com/keszybz/add-determinism/blob/main/rpm/macros.build-reproducibility add-determinism macros.build-reproducibility]
* [https://github.com/keszybz/add-determinism/tree/main?tab=readme-ov-file#build-postprocessor-to-reset-metadata-fields-for-build-reproducibility add-determinism README]

== Release Notes ==
Fedora package builds are now more deterministic, bringing the distribution closer to the goal of achieving fully reproducible builds for all of its packages.
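For the file-level test described under "How To Test", a small wrapper can report whether the tool changed a given file. The sketch below runs the command on a scratch copy so the original stays untouched. It assumes that the command rewrites files in place and that `add-determinism` is on `PATH`; the function name `file_changed_by_tool` and its `command` parameter are hypothetical, and the caller must supply the `SOURCE_DATE_EPOCH` value.

```python
import hashlib
import os
import shutil
import subprocess
import tempfile
from pathlib import Path


def file_changed_by_tool(path, source_date_epoch, command=("add-determinism", "-v")):
    """Run `command <copy>` with SOURCE_DATE_EPOCH set; return True if the copy changed.

    Assumption: the command modifies the file it is given in place.
    """
    env = dict(os.environ, SOURCE_DATE_EPOCH=str(source_date_epoch))
    with tempfile.TemporaryDirectory() as scratch:
        copy = Path(scratch) / Path(path).name
        shutil.copy2(path, copy)
        before = hashlib.sha256(copy.read_bytes()).hexdigest()
        # check=True raises CalledProcessError if the command exits non-zero.
        subprocess.run([*command, str(copy)], env=env, check=True)
        after = hashlib.sha256(copy.read_bytes()).hexdigest()
    return before != after
```

Running it twice in a row on the tool's own output is a quick sanity check: once a file has been processed, a second pass would be expected to report no change.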
--
Aoife Moloney
Fedora Operations Architect
Fedora Project
Matrix: @amoloney:fedora.im
IRC: amoloney