Re: [Rpm-maint] [rpm-software-management/rpm] Make rpm builds more reproducible (Discussion #2654)
Bottom line: I'd like to have a reproducible "build environment" and a reproducible "build process" that ensures that these nuances can be avoided and the "build" will just work consistently every time. What I build: you can build without asking questions and vice versa. -- Reply to this email directly or view it on GitHub: https://github.com/rpm-software-management/rpm/discussions/2654#discussioncomment-9062381 You are receiving this because you are subscribed to this thread. Message ID: ___ Rpm-maint mailing list Rpm-maint@lists.rpm.org http://lists.rpm.org/mailman/listinfo/rpm-maint
Re: [Rpm-maint] [rpm-software-management/rpm] Make rpm builds more reproducible (Discussion #2654)
"Ideally, the NEVRA would indicate the vendor via the %_dist suffix." - this does not handle builds for the different "EL" flavours. They **all** tend to use **the same value**: `.el7`, `.el8`, `.el9`, etc and then again this is not done consistently, there is no agreed specification of how/when this should be done. I've also seen issues like this when doing MySQL rpm rebuilds: https://github.com/sjmudd/mysql-rpm-builder/blob/main/config/ossetup__centos.9__8.0.33%2B.sh#L15-L25, when setting up a build environment different repos need to be used depending on which "EL" distribution you are using. What's worse is that the similarly named packages are not actually the same, so https://github.com/sjmudd/mysql-rpm-builder/blob/main/config/ossetup__centos.9__8.0.33%2B.sh#L59-L70 is required to build MySQL on CentOS 9 vs on Oracle Linux 9 (2 variations I was testing) as symlinking is different. This may be unusual but it's just an indication of RPM builds that "should work" but fail or really obscure reasons, precisely because the "reproducible builds is actually quite hard to do". Mainly it works: sometimes it doesn't and you need to dig really deep to figure out what triggers breakages like this. -- Reply to this email directly or view it on GitHub: https://github.com/rpm-software-management/rpm/discussions/2654#discussioncomment-9062339 You are receiving this because you are subscribed to this thread. Message ID: ___ Rpm-maint mailing list Rpm-maint@lists.rpm.org http://lists.rpm.org/mailman/listinfo/rpm-maint
Re: [Rpm-maint] [rpm-software-management/rpm] Make rpm builds more reproducible (Discussion #2654)
Perhaps I shouldn't overload the issue, but the other big problem I see with rpm building is the source of packages. All that `rpm` does is confirm that the packages named in `BuildRequires:` are installed for building, or those named in `Requires:` for installing, but a package name on its own is not helpful, as many people use multiple repos. `yum`/`dnf` have been able to configure and pull in rpms from external repos forever, but even if you do that, the rpm spec file itself does not designate or confirm these sources, so it may be impossible to reproduce a build because you cannot find the actual rpm that was used for building or is required to install. When you take into account the mix of RHEL, CentOS, OEL, AlmaLinux, RockyLinux, SuSE and its derivatives etc., this makes package maintenance, builds and rebuilds much more complex. It ends up being quite a mess if you ever want to build on similar systems (e.g. the RH variants). So having a way of addressing this better would be an enormous help. -- Reply to this email directly or view it on GitHub: https://github.com/rpm-software-management/rpm/discussions/2654#discussioncomment-8721990
Re: [Rpm-maint] [rpm-software-management/rpm] Make rpm builds more reproducible (Issue #2590)
To your comment:
> in some workflows people receive an srpm from somewhere and build that.

My situation is precisely that. **When I try to rebuild the single package, the process fails at the end of a 3-hour build run with an extremely obscure error message.** Yet somehow it works for the upstream packager, as the rpms are built and shared publicly. The upstream packager's build environment is not public, and I've seen that in some cases some "munging" of the build environment is needed to make the build work. That's very messy. However, I think you understand my point of view. My intent in creating this issue was to bring the problem up. It seems you're aware of the problem space and have shared that others also experience similar issues. Is there anything that can be done now? What should happen next? I do not think I can do anything right now, and clearly any changes would be a long-term effort. Can any further progress be made? -- Reply to this email directly or view it on GitHub: https://github.com/rpm-software-management/rpm/issues/2590#issuecomment-1667726799
Re: [Rpm-maint] [rpm-software-management/rpm] Make rpm builds more reproducible (Issue #2590)
OK, so it depends on your point of view; that's clear. For downstream "repackagers" there's clearly internal macro "mangling", and perhaps build environment differences specifically due to that. While that's a very important use case, it's not the same as mine. It may be that some of this behaviour needs to be optional. But if the current rpm build process does not allow you to complete a rebuild correctly, or requires a large amount of investigation to work out how to reproducibly build packages, then to some extent I think it's fragile. The link you provide just goes to show how fragile the current process is if you look at it in any detail, even if building generally seems to work. I suspect/know that things are more complex now than they were in the early rpm days. -- Reply to this email directly or view it on GitHub: https://github.com/rpm-software-management/rpm/issues/2590#issuecomment-1667718170
Re: [Rpm-maint] [rpm-software-management/rpm] Make rpm builds more reproducible (Issue #2590)
Also related to your comments about how rpm works with dependencies: I'm aware of rpm dependencies. I've been building rpm packages since RedHat 3.0.3 (that's in 1996). Also, I'm not the owner of the rpms I want to rebuild, so it's not just a matter of me rebuilding my own rpms; it's also a matter of me figuring out how to rebuild others' rpms. I could certainly suggest that the upstream packagers make changes to their packages, but that's a longer-term discussion and it requires them to do this explicitly. The point of the issue I have created is to have rpm do this for us automatically, so I don't have to figure out the details myself. -- Reply to this email directly or view it on GitHub: https://github.com/rpm-software-management/rpm/issues/2590#issuecomment-1667348812
Re: [Rpm-maint] [rpm-software-management/rpm] Make rpm builds more reproducible (Issue #2590)
In the end, **as a first step**, I'd like to be able to reproduce builds as closely as possible to what the original builder did. So I'd like that build to be reproducible in the way referenced in the URL.

- One thing is to record the rpm macro values used during the build process.
- Another simple thing would be to record the full list of names and versions of the rpms installed on the build system.

"rpm builds" are a bit of a pain, as we often use multiple repos. The build environment does not explicitly say where the `BuildRequires:` packages come from, so `rpm` itself is unable to pull down the packages needed to trigger the builds. `yum` or `dnf` could do that, but `rpm` doesn't know about this, and the lack of integration between the two sets of tools has always somewhat surprised me, though I guess that's a different story and it's water under the bridge now.

I have changed the issue title to **make rpm builds more reproducible**, as I think that's what I want, and I think that solving the 2 points above would help a lot. I'm also aware that if you run `rpmbuild -bs .spec` there's no binary build process going on, so I'd guess it's quite possible that macro definitions and installed package lists may not actually be useful there. However, if you build source and binary packages together, which I'd assume is the correct thing to do, e.g. `rpmbuild -ba whatever.spec`, then you will have the information needed and could save it inside the `src.rpm`. I think this would pretty much solve my problem.

My second desire, **which is NOT relevant to this issue**, is to then be able to modify the build config or sources, or add patches, so that the finally built packages provide extra features compared to the originally built packages, yet, if done correctly, remain compatible with them. You could think of this along the lines of building extra kernel modules in a separate rpm, provided as part of the build process, which can be installed and used on the upstream running base kernel. It's much easier to do this if you can reproduce the upstream build process first.

-- Reply to this email directly or view it on GitHub: https://github.com/rpm-software-management/rpm/issues/2590#issuecomment-1667282383
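The two items requested in this message (macro values and the installed package list) could be captured by a wrapper around the build. The sketch below shows one hypothetical shape for such a record. The JSON layout and the name `build_manifest` are assumptions for illustration; the message describes wanting this information saved inside the `src.rpm`, and the sketch does not attempt that step. The package names in the sample are made up.

```python
import json


def build_manifest(defines: dict[str, str], installed: str) -> dict:
    """Bundle command-line macro definitions and the installed package list.

    `installed` is expected to be the text output of `rpm -qa`,
    one package per line.
    """
    packages = sorted(
        line.strip() for line in installed.splitlines() if line.strip()
    )
    return {
        "defines": dict(sorted(defines.items())),
        "installed_packages": packages,
    }


# Illustrative inputs; these package names are made-up examples.
sample_rpm_qa = "zlib-1.2.11-40.el9.x86_64\nbash-5.1.8-6.el9.x86_64\n"
manifest = build_manifest({"rh9": "1"}, sample_rpm_qa)
print(json.dumps(manifest, indent=2))
```

Sorting both parts keeps the record stable between runs, so two manifests can be compared with an ordinary text diff.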
Re: [Rpm-maint] [rpm-software-management/rpm] Make rpm rebuilds more reproducible (Issue #2590)
I think you misunderstand. If you build for multiple OS versions, e.g. RHEL 7..9, then the dependencies are different, so with the way rpm works at the moment you can't explicitly configure all the dependencies at the same time. This is handled by adding macro processing that is told which OS to build for, e.g. Oracle community MySQL rpms tend to use `--define 'rh7 1'`, `--define 'rh8 1'` or `--define 'rh9 1'` to provide the "hints" needed to build the package. This can't be done during the build process, as the `BuildRequires:` and `Requires:` tags are "static" by that time (evaluated prior to the build by `rpmbuild`).

Yet to be reproducible you need to know which macros were provided by the user, or at least which are "not part of the base rpm/build setup" and are therefore "user configurable". Without that you may not be able to determine exactly the same "input parameters" needed to rebuild a package in the same way as the original packager. Similarly, if I want to patch this package with different/extra functionality, I cannot be sure that my build will represent a consistent change against the original sources, and thus the whole premise of repeatable builds falls apart.

My question therefore was whether there are any plans to ensure that a built src rpm could be configured to include any rpmbuild-time command line macros as metadata, so that I can, in theory, use that to reproduce the build process faithfully.

The reason this came up: I'm having problems reproducing the builds because it looks like the build environment used by the original packagers may not be "pristine". To fix one specific build issue I had to add some symlinks, e.g. https://github.com/sjmudd/mysql-rpm-builder/blob/main/config/prepare__centos.8__8.0.33.sh#L44-L50, to set up the OS in such a way that the build process would complete without errors. That's clearly a very obscure example, but it's real.

So I'm looking at ways to make it easier for the rpm packaging to be configured so that issues such as this can be avoided and, given a "suitable spec file", I really can repeatably build from scratch successfully.

-- Reply to this email directly or view it on GitHub: https://github.com/rpm-software-management/rpm/issues/2590#issuecomment-1665499416
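If the command line macros had been recorded somewhere, replaying them would be mechanical. As a rough sketch under that assumption, the helper below rebuilds an `rpmbuild` invocation from a mapping of macro names to values. The helper name `rpmbuild_command`, the spec file name and the recorded values are hypothetical; only `rpmbuild`, `-ba` and `--define 'name value'` come from the thread.

```python
import shlex


def rpmbuild_command(
    spec: str, defines: dict[str, str], mode: str = "-ba"
) -> list[str]:
    """Return an argv list that replays the recorded --define options."""
    argv = ["rpmbuild", mode]
    for name, value in defines.items():
        # rpmbuild expects the macro name and its value as one argument.
        argv += ["--define", f"{name} {value}"]
    argv.append(spec)
    return argv


# Hypothetical recorded macros for an EL9 build.
cmd = rpmbuild_command("mysql.spec", {"rh9": "1"})
print(shlex.join(cmd))  # rpmbuild -ba --define 'rh9 1' mysql.spec
```

Returning an argv list rather than a string means the result can be passed to `subprocess.run` without a shell, which avoids quoting problems when macro values contain spaces.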
[Rpm-maint] [rpm-software-management/rpm] Make rpm rebuilds more reproducible (Issue #2590)
I was looking at rebuilding some MySQL community rpms which are normally built by Oracle, but doing this turns out to be surprisingly hard. The spec file, https://github.com/mysql/mysql-server/blob/trunk/packaging/rpm-oel/mysql.spec.in, uses a number of macros to define various parts of the build process, `BuildRequires:` entries and so on, depending on the OS being used. This works for RHEL 7..9 but of course should also work for CentOS 7..9 and other similar distros.

However, to reproduce a build made by someone else you need to know the exact macro definitions in effect when the rpms were built. Unless I'm mistaken, if you build a package with `rpmbuild --define 'something 1' --define 'something_else 1' name.spec`, then the actual command line arguments used to build the package are not explicitly recorded in the binary rpms nor, perhaps more importantly, in the src.rpm, which I think only contains the sources and the spec file used. If that's the case, the lack of this information means that from a src rpm I may be unable to rebuild the binary rpms in the same way as the original packager. Is this assumption correct? If so, would it make sense for the .src.rpm to also include the command line defines (and anything else that might make sense) to simplify this task?

I also notice that building any software depends on the software installed on the host/container where the build process runs, yet this is also not "registered". For rpm systems it might be convenient to also record the installed rpm package list, as that would be useful for reproducing the build environment appropriately. Outside of rpm itself there is repo configuration, which is OS dependent, and it seems that RHEL/CentOS/OEL and the other RH clones all do things slightly differently, which makes rebuilds more complex. I guess that's outside the scope of this issue.

Why would improving this be useful if the source is provided? Simply because I may want to patch the originally built rpms in a specific way yet be sure that the rest of the build and packaging process stays as close to the original packaging as possible. Alternatively, I may want to build a sub-module of the upstream packages which is compatible with the originally built packages and can be used without having to rebuild the whole upstream code again.

So far I've not seen a way to make this process simpler, and I think the suggestions above, to include more information on the build command line arguments (and maybe macro values) and the installed package list, would help the rebuild process. Can something be done in this direction?

For context I created https://github.com/sjmudd/mysql-rpm-builder/, which was an attempt to simplify / document the reproducible rebuild process, and it has turned out to be harder than originally anticipated. It is still a work in progress but maybe gives some context as to where the question comes from.

-- Reply to this email directly or view it on GitHub: https://github.com/rpm-software-management/rpm/issues/2590
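To illustrate why a recorded installed-package list would help with rebuilds, the sketch below compares two such lists, one assumed to come from the original build host and one from the rebuild host, each being plain `rpm -qa` text. The function name `diff_package_lists`, the result keys and the sample package names are assumptions for illustration only.

```python
def diff_package_lists(original: str, rebuild: str) -> dict[str, list[str]]:
    """Compare two `rpm -qa` style listings, one package per line."""
    orig = {line.strip() for line in original.splitlines() if line.strip()}
    new = {line.strip() for line in rebuild.splitlines() if line.strip()}
    return {
        "missing_on_rebuild_host": sorted(orig - new),
        "extra_on_rebuild_host": sorted(new - orig),
    }


# Made-up sample listings for two hosts.
original_host = "bash-5.1.8-6.el9.x86_64\ncmake-3.20.2-8.el9.x86_64\n"
rebuild_host = "bash-5.1.8-6.el9.x86_64\ncmake-3.26.5-2.el9.x86_64\n"

report = diff_package_lists(original_host, rebuild_host)
for kind, packages in report.items():
    print(kind, packages)
```

A version mismatch shows up as one entry in each list, which is often enough of a hint when a rebuild fails for an otherwise obscure reason.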