Re: [Rpm-maint] [rpm-software-management/rpm] Reproducible builds improvements (Discussion #2934)
In case folks didn't notice the PR from @mlschroe : https://github.com/rpm-software-management/rpm/pull/2944 -- Reply to this email directly or view it on GitHub: https://github.com/rpm-software-management/rpm/discussions/2934#discussioncomment-8676851 You are receiving this because you are subscribed to this thread. Message ID: ___ Rpm-maint mailing list Rpm-maint@lists.rpm.org http://lists.rpm.org/mailman/listinfo/rpm-maint
Re: [Rpm-maint] [rpm-software-management/rpm] Reproducible builds improvements (Discussion #2934)
I don't think it's a good idea to offer. I am not convinced these knobs are a good idea for RPM to expose for any reason, especially reproducibility. -- Reply to this email directly or view it on GitHub: https://github.com/rpm-software-management/rpm/discussions/2934#discussioncomment-8643933
Re: [Rpm-maint] [rpm-software-management/rpm] Reproducible builds improvements (Discussion #2934)
I am aware of some tools that use `RPMTAG_BUILDTIME` to sort packages in various situations, especially if they have the same NVRA (i.e. rebuilds). It is also useful for diagnostic purposes when trying to figure out what caused a breakage. I would rather not falsify this tag. -- Reply to this email directly or view it on GitHub: https://github.com/rpm-software-management/rpm/discussions/2934#discussioncomment-8643922
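To make the sorting concern concrete, here is a minimal sketch of a tool that picks the newest build among packages sharing an NVRA. The function name and the `(nvra, buildtime, path)` tuple layout are assumptions for illustration, not taken from any real tool.

```python
def newest_per_nvra(packages):
    """Pick one package path per NVRA, preferring the largest build time.

    `packages` is an iterable of (nvra, buildtime, path) tuples; this layout
    is a hypothetical stand-in for whatever a real tool reads from the header.
    """
    best = {}
    for nvra, buildtime, path in packages:
        # Strictly greater: on a tie, the first package seen wins.
        if nvra not in best or buildtime > best[nvra][0]:
            best[nvra] = (buildtime, path)
    return {nvra: path for nvra, (_, path) in best.items()}


if __name__ == "__main__":
    rebuilds = [
        ("foo-1.0-1.noarch", 1690561189, "old/foo-1.0-1.noarch.rpm"),
        ("foo-1.0-1.noarch", 1708948849, "new/foo-1.0-1.noarch.rpm"),
    ]
    print(newest_per_nvra(rebuilds))
```

If both rebuilds carried the same normalized build time, the result would depend only on input order, which is the kind of ambiguity described above.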
Re: [Rpm-maint] [rpm-software-management/rpm] Reproducible builds improvements (Discussion #2934)
Yes, I think both are worthwhile. But they must be opt-in. -- Reply to this email directly or view it on GitHub: https://github.com/rpm-software-management/rpm/discussions/2934#discussioncomment-8643884
Re: [Rpm-maint] [rpm-software-management/rpm] Reproducible builds improvements (Discussion #2934)
I think this all has drifted away from the initial proposal. The goal was to be able to improve reproducibility of a given rpm by:

- adding a way to specify the buildtime
- adding an option to clamp the file mtimes to the buildtime

Disregarding the implementation details, do you all think this is worthwhile to have?

-- Reply to this email directly or view it on GitHub: https://github.com/rpm-software-management/rpm/discussions/2934#discussioncomment-8643827
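For readers unfamiliar with the term, "clamping" means that any file modified after the chosen build time has its mtime lowered to that build time, while older files are left alone. A minimal sketch on a plain directory tree, assuming the build time is given as a Unix timestamp; the function name is hypothetical and this is not how rpm itself would implement it:

```python
import os


def clamp_mtimes(root, build_time):
    """Lower every mtime under `root` that is newer than `build_time`.

    Returns the number of entries that were changed. Symlinks are skipped
    to keep the sketch portable.
    """
    changed = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue
            st = os.stat(path)
            if st.st_mtime > build_time:
                # Keep the access time, clamp only the modification time.
                os.utime(path, (st.st_atime, build_time))
                changed += 1
    return changed


if __name__ == "__main__":
    # Assumption for this demo: take the build time from SOURCE_DATE_EPOCH.
    epoch = int(os.environ.get("SOURCE_DATE_EPOCH", "0"))
    if epoch:
        print(clamp_mtimes(".", epoch), "entries clamped")
```

Files older than the build time keep their original mtime, so the result is stable across rebuilds as long as the sources themselves do not change.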
Re: [Rpm-maint] [rpm-software-management/rpm] Reproducible builds improvements (Discussion #2934)
I did not mean to alter signing time - but keep it as it is (it is dropped by delsign anyway), while changing "Build Date" instead to something that does not vary in (changeless) rebuilds. -- Reply to this email directly or view it on GitHub: https://github.com/rpm-software-management/rpm/discussions/2934#discussioncomment-8641508
Re: [Rpm-maint] [rpm-software-management/rpm] Reproducible builds improvements (Discussion #2934)
I think the signature must give the real date of when the signature was actually made. Setting a fake date would be very very icky, undermining the trust in the signing process and the holders of the signing key used in such a manner.

At the more technical level, keys have a creation time, e.g. for Fedora the keys are created a few months in advance of the release (RPM-GPG-KEY-fedora-rawhide-x86_64 has Public key creation time - Tue Jan 24 22:22:52 CET 2023). This means that those keys cannot be used to create valid signatures for older packages, but at various points there certainly are packages that haven't been touched and have a SOURCE_DATE_EPOCH older than the key creation date.

Also, at least in Fedora, packages are re-signed with a newer signature for a new release. (E.g. a .f39 or .f40 package, when downloaded from the F41/rawhide repository, is not rebuilt, but is re-signed with the F41 key.) So we *need* a signing time that is separate from BUILDTIME.

-- Reply to this email directly or view it on GitHub: https://github.com/rpm-software-management/rpm/discussions/2934#discussioncomment-8641433
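The key-creation argument can be illustrated with a small check. This is only a sketch of the ordering constraint, not a real OpenPGP verification; the function name and the 2022 example timestamp are assumptions, while the key creation time is the one quoted above.

```python
from datetime import datetime, timedelta, timezone

# Key creation time quoted above: Tue Jan 24 22:22:52 CET 2023 (CET = UTC+1).
CET = timezone(timedelta(hours=1))
KEY_CREATED = datetime(2023, 1, 24, 22, 22, 52, tzinfo=CET)


def signature_time_plausible(signed_at, key_created=KEY_CREATED):
    """A signature cannot sensibly predate the key that made it."""
    return signed_at >= key_created


if __name__ == "__main__":
    # Hypothetical SOURCE_DATE_EPOCH of an untouched package, from mid-2022.
    old_epoch = datetime(2022, 7, 1, tzinfo=timezone.utc)
    print(signature_time_plausible(old_epoch))
```

Reusing such an old SOURCE_DATE_EPOCH as the signing time would fail this check, which is why the signing time has to stay independent of BUILDTIME.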
Re: [Rpm-maint] [rpm-software-management/rpm] Reproducible builds improvements (Discussion #2934)
When I normalize BUILDTIME with `%use_source_date_epoch_as_buildtime`, the signature still gives the real date. Is there value in keeping both? E.g. for [this package](https://build.opensuse.org/package/show/home:bmwiedemann:reproducible/strip-nondeterminism), `rpm -qi` has

```
Signature   : RSA/SHA256, 2024-02-26T12:00:49 UTC, Key ID 8adc26dbb49c2121
Source RPM  : strip-nondeterminism-1.13.1-33.9.src.rpm
Build Date  : 2023-07-28T16:19:49 UTC
```

Not overriding BUILDHOST is fine as it still allows easy verification.

-- Reply to this email directly or view it on GitHub: https://github.com/rpm-software-management/rpm/discussions/2934#discussioncomment-8640953
Re: [Rpm-maint] [rpm-software-management/rpm] Reproducible builds improvements (Discussion #2934)
I've been bitten enough times personally that I would rather not have BUILDHOST and BUILDTIME set to fake values. -- Reply to this email directly or view it on GitHub: https://github.com/rpm-software-management/rpm/discussions/2934#discussioncomment-8630113
Re: [Rpm-maint] [rpm-software-management/rpm] Reproducible builds improvements (Discussion #2934)
I don't think that a custom "rpmhash" tool is the problem. We have to "trust" the tools anyway… A tool that deletes signatures is as much an opaque binary as the tool that calculates some hash. I think it would be a reasonable compromise to say that the hypothetical "rpmhash" tool must give a result that is identical to delsign+sha256sum. The problem is to agree on what exactly is stripped and/or skipped in the hash.

FWIW, I've been going through Fedora rebuilds over the last few days, and there is clear value in having BUILDHOST set to a non-fake value. For example in https://bugzilla.redhat.com/show_bug.cgi?id=2266767#c4, it was very helpful in diagnosing an arch-specific issue in a noarch package.

-- Reply to this email directly or view it on GitHub: https://github.com/rpm-software-management/rpm/discussions/2934#discussioncomment-8630015
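To picture what "skipped in the hash" could mean, here is a toy sketch in which a header is modelled as a plain dict of tag name to bytes. This is an assumption for illustration only: it is not the real RPM header layout, `masked_hash` is a hypothetical name, and the set of masked tags is exactly the thing the thread has not agreed on.

```python
import hashlib

# Hypothetical choice of tags to skip; the thread has no consensus on this.
MASKED_TAGS = frozenset({"BUILDTIME", "BUILDHOST"})


def masked_hash(header, payload, masked=MASKED_TAGS):
    """SHA-256 over header fields and payload, skipping masked tags.

    `header` maps tag names (str) to raw values (bytes). Each field is
    length-prefixed so that adjacent fields cannot be confused.
    """
    h = hashlib.sha256()
    for tag in sorted(header):
        if tag in masked:
            continue  # masked fields never reach the hash
        for field in (tag.encode("utf-8"), header[tag]):
            h.update(len(field).to_bytes(8, "big"))
            h.update(field)
    h.update(payload)
    return h.hexdigest()


if __name__ == "__main__":
    a = {"NAME": b"foo", "BUILDTIME": b"1690561189", "BUILDHOST": b"host-a"}
    b = {"NAME": b"foo", "BUILDTIME": b"1708948849", "BUILDHOST": b"host-b"}
    print(masked_hash(a, b"payload") == masked_hash(b, b"payload"))
```

The data is read once and nothing is written back, but every verifier has to agree on the mask, which is the trust question raised in this message.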
Re: [Rpm-maint] [rpm-software-management/rpm] Reproducible builds improvements (Discussion #2934)
I'm always thinking about rebuild+compare as one operation. In the Debian and Arch Linux space there were also discussions about centralized collections of multiple rebuilder-results. Those are signed data containing "$rebuildername built $package $version and got output $hash". That would work poorly with fuzzy-matching. It could work with a custom rpmhash tool, but how do you prove that it indeed covers all relevant bits? I don't like that and would rather see us reach bit-reproducible rpms (after delsign) that work with generic `sha256sum`.

-- Reply to this email directly or view it on GitHub: https://github.com/rpm-software-management/rpm/discussions/2934#discussioncomment-8629486
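A rough sketch of how such rebuilder results could be compared when the hash is a plain SHA-256 of the (signature-stripped) file. The record fields mirror the "$rebuildername built $package $version and got output $hash" shape quoted above; the class and function names are hypothetical, and signing of the records is left out.

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class RebuildResult:
    rebuilder: str
    package: str
    version: str
    sha256: str


def sha256_of_file(path, chunk_size=1 << 20):
    """Same digest that `sha256sum` prints, computed in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def disagreements(results):
    """Map (package, version) to the set of hashes, where rebuilders differ."""
    seen = defaultdict(set)
    for r in results:
        seen[(r.package, r.version)].add(r.sha256)
    return {key: hashes for key, hashes in seen.items() if len(hashes) > 1}
```

With exact hashes the comparison is a set lookup; with fuzzy matching each pair of results would need the full artifacts to compare.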
Re: [Rpm-maint] [rpm-software-management/rpm] Reproducible builds improvements (Discussion #2934)
If we could drop OPTFLAGS, that'd be great. -- Reply to this email directly or view it on GitHub: https://github.com/rpm-software-management/rpm/discussions/2934#discussioncomment-8623707
Re: [Rpm-maint] [rpm-software-management/rpm] Reproducible builds improvements (Discussion #2934)
"Implementation detail". The important part is to get the payload and significant metadata to be identical. Once we have that, we can do optimizations to handle comparisons efficiently. One option is to strip fields and hash that. Another option, for example, would be to define a hash method where some fields are masked (simply skipped when hashing). In fact, I think that this second option is more efficient, because you only need to read the original archive once and don't even need to write a dummy rpm.

> needing build outputs in addition to build inputs is still needing more

We're only talking about using build outputs for the comparison. We don't need them for the rebuild itself.

-- Reply to this email directly or view it on GitHub: https://github.com/rpm-software-management/rpm/discussions/2934#discussioncomment-8623344
Re: [Rpm-maint] [rpm-software-management/rpm] Reproducible builds improvements (Discussion #2934)
Yes, but needing build outputs in addition to build inputs is still needing more. -- Reply to this email directly or view it on GitHub: https://github.com/rpm-software-management/rpm/discussions/2934#discussioncomment-8619915
Re: [Rpm-maint] [rpm-software-management/rpm] Reproducible builds improvements (Discussion #2934)
You already need all the inputs to correctly reproduce packages in openSUSE. The build system doesn't capture this, but it's still required. -- Reply to this email directly or view it on GitHub: https://github.com/rpm-software-management/rpm/discussions/2934#discussioncomment-8618519
Re: [Rpm-maint] [rpm-software-management/rpm] Reproducible builds improvements (Discussion #2934)
keszybz wrote:

> any party can recreate copies of the artifacts that are identical except for
> the signatures and parts of metadata

I don't think it is a good idea to exclude metadata. One benefit that you can only get with bit-identical reproducibility is that you can list the one and only correct hash value of the build result. (That also works with signed rpms + delsign.) However, with weaker variants, you always need another full rpm to compare to. I.e. for our 16k packages, instead of publishing a list of 16k hashes you then need to keep the full archive (100GB) to allow people to reproduce any package.

-- Reply to this email directly or view it on GitHub: https://github.com/rpm-software-management/rpm/discussions/2934#discussioncomment-8618492
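A back-of-the-envelope calculation of the size difference, assuming one hex-encoded SHA-256 per line (64 characters plus a newline) and taking "16k" and "100GB" from the message above as round figures:

```python
PACKAGES = 16_000              # "16k packages", taken as a round number
HASH_LINE_BYTES = 64 + 1       # 64 hex digits of SHA-256 plus a newline
ARCHIVE_BYTES = 100 * 10**9    # "100GB", taken as decimal gigabytes

hash_list_bytes = PACKAGES * HASH_LINE_BYTES

if __name__ == "__main__":
    print(f"hash list: {hash_list_bytes} bytes")
    print(f"archive is roughly {ARCHIVE_BYTES // hash_list_bytes}x larger")
```

Under these assumptions the hash list is about 1 MB. Adding file names to each line would grow it somewhat, but it stays several orders of magnitude smaller than the archive.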
Re: [Rpm-maint] [rpm-software-management/rpm] Reproducible builds improvements (Discussion #2934)
Oh BTW, just a quick side-remark on this:

> OPTFLAGS and PLATFORM are often different because a "random" noarch package
> is selected

OPTFLAGS shouldn't even be defined on noarch builds, much less included in the header. The former is hard to fix for various hysterical reasons, but the latter should be easy.

-- Reply to this email directly or view it on GitHub: https://github.com/rpm-software-management/rpm/discussions/2934#discussioncomment-8618013
Re: [Rpm-maint] [rpm-software-management/rpm] Reproducible builds improvements (Discussion #2934)
It's also important to keep in mind the context of Debian style reproducibility: their archive format is an ar archive with tarballs inside. That makes things very different for them than for us. -- Reply to this email directly or view it on GitHub: https://github.com/rpm-software-management/rpm/discussions/2934#discussioncomment-8617949
Re: [Rpm-maint] [rpm-software-management/rpm] Reproducible builds improvements (Discussion #2934)
> I saw "reproducability" mentioned a few times. I assume it's not a typo, but
> I have no idea how it's supposed to be different from "reproducibility".

Eh. All my life I've been talking about reproducers, and reproducable bugs. And now builds. :flushed: That misspelling is going to be hard to unlearn, but thanks for setting me straight there.

-- Reply to this email directly or view it on GitHub: https://github.com/rpm-software-management/rpm/discussions/2934#discussioncomment-8617916
Re: [Rpm-maint] [rpm-software-management/rpm] Reproducible builds improvements (Discussion #2934)
> Wait, what? If those differ then the packages do differ, so its not actually
> bit-per-bit identical. Which is what _I've_ assumed reproducability to mean.
> This just goes to point out how completely different expectations people
> have. No wonder having a meaningful discussion about reproducable packages
> always seems so hard

I wrote a long piece about this [here](https://discussion.fedoraproject.org/t/report-from-the-reproducible-builds-hackfest-during-flock-2023/87469).

> Over the last years I just used `rpm --delsign` to compare with my
> replication builds and was able to get bit-identical results

Whether we skip some fields when doing a comparison, or take an rpm and strip those fields, and then do the comparison, is just an implementation detail. In practice, users get rpms that are signed. Thus, the format that the users are interested in checking is by definition the signed rpm.

(The other end is interesting too. We generally talk about reproducibility in the sense of starting from srpms. This view originates in the Debian world where the source deb is the only common denominator. Packagers do not have to use git, they do not even have to use a vcs, and people do non-version-controlled binNMUs. Thus, when talking about the whole distro, starting from source debs is the only option. When working with rpms, at the technical level, getting the part from the srpm to the binary rpm reproducible is challenging, so it makes sense for us to work on this part in the beginning. But what we actually want in the end is reproducibility of the **full pipeline**, i.e. starting from dist-git. I assume that adding the additional step where we generate the srpm from dist-git will be easy. And in dist-git, we want to have the upstream pristine tarballs, including a signature. In the end, ideally the user would be able to verify that the signed upstream tarball + a specific commit with our spec file leads to the rpms that they download from the mirror, reproducibly.)

> Having a written definition of what "reproducability" means would help
> driving towards that goal.

I saw "reproducability" mentioned a few times. I assume it's not a typo, but I have no idea how it's supposed to be different from "reproducibility". Please see the link above for my definition of "reproducibility".

-- Reply to this email directly or view it on GitHub: https://github.com/rpm-software-management/rpm/discussions/2934#discussioncomment-8617435