Re: [Rpm-maint] [rpm-software-management/rpm] Make rpm builds more reproducible (Discussion #2654)

2024-04-09 Thread Simon J Mudd
Bottom line: I'd like to have a reproducible "build environment" and a 
reproducible "build process" so that these nuances can be avoided and the 
build just works consistently every time. Whatever I build, you should be able 
to build without asking questions, and vice versa.

-- 
Reply to this email directly or view it on GitHub:
https://github.com/rpm-software-management/rpm/discussions/2654#discussioncomment-9062381
You are receiving this because you are subscribed to this thread.

Message ID: 
___
Rpm-maint mailing list
Rpm-maint@lists.rpm.org
http://lists.rpm.org/mailman/listinfo/rpm-maint


Re: [Rpm-maint] [rpm-software-management/rpm] Make rpm builds more reproducible (Discussion #2654)

2024-04-09 Thread Simon J Mudd
"Ideally, the NEVRA would indicate the vendor via the %_dist suffix."

- This does not handle builds for the different "EL" flavours. They **all** 
tend to use **the same value**: `.el7`, `.el8`, `.el9`, etc. Even that is not 
done consistently, as there is no agreed specification of how or when it 
should be done.
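
To illustrate the point, here is a minimal Python sketch of how a build script could tell the "EL" flavours apart even though they share the same dist suffix. It assumes the distribution identifies itself through the `ID` and `VERSION_ID` fields of `/etc/os-release`. The sample contents and the `build_target` helper are hypothetical and only for illustration.

```python
from __future__ import annotations


def parse_os_release(text: str) -> dict[str, str]:
    """Parse os-release style KEY=value lines into a dict, stripping quotes."""
    fields: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value.strip().strip('"').strip("'")
    return fields


def build_target(text: str) -> str:
    """Return a label such as 'ol-9' that separates flavours sharing '.el9'."""
    fields = parse_os_release(text)
    distro = fields.get("ID", "unknown")
    major = fields.get("VERSION_ID", "0").split(".")[0]
    return f"{distro}-{major}"


# Illustrative samples only (assumptions, not copied from real systems).
CENTOS_SAMPLE = 'NAME="CentOS Stream"\nID="centos"\nVERSION_ID="9"\n'
ORACLE_SAMPLE = 'NAME="Oracle Linux Server"\nID="ol"\nVERSION_ID="9.3"\n'

if __name__ == "__main__":
    print(build_target(CENTOS_SAMPLE))
    print(build_target(ORACLE_SAMPLE))
```

Both samples would carry `.el9` in the package release, yet the labels differ, which is the information the dist suffix alone does not give you.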

I've also seen issues like this when doing MySQL rpm rebuilds: 
https://github.com/sjmudd/mysql-rpm-builder/blob/main/config/ossetup__centos.9__8.0.33%2B.sh#L15-L25
When setting up a build environment, different repos need to be used depending 
on which "EL" distribution you are using.
What's worse is that similarly named packages are not actually the same, so 
https://github.com/sjmudd/mysql-rpm-builder/blob/main/config/ossetup__centos.9__8.0.33%2B.sh#L59-L70
is required to build MySQL on CentOS 9 vs. Oracle Linux 9 (the 2 variations I 
was testing), as the symlinking is different. This may be unusual, but it is an 
indication of RPM builds that "should work" yet fail for really obscure 
reasons, precisely because reproducible builds are actually quite hard to do. 
Mostly it works; sometimes it doesn't, and you need to dig really deep to 
figure out what triggers breakages like this.

-- 
Reply to this email directly or view it on GitHub:
https://github.com/rpm-software-management/rpm/discussions/2654#discussioncomment-9062339


Re: [Rpm-maint] [rpm-software-management/rpm] Make rpm builds more reproducible (Discussion #2654)

2024-03-08 Thread Simon J Mudd
Perhaps I shouldn't overload the issue, but the other big problem I see with 
rpm building is the source of packages. All that `rpm` does is confirm that 
the packages named in `BuildRequires:` (for building) or `Requires:` (for 
installing) are installed, but a package name on its own is not helpful, as 
many people use multiple repos.  
`Yum/Dnf` have been able to configure and pull in rpms from external repos 
forever, but even if you do that, the rpm spec file itself does not designate 
or confirm these sources. It may therefore be impossible to reproduce a build, 
because you cannot find the actual rpm that was used for building or is 
required to install.  
When you take into account the mix of RHEL, CentOS, OEL, AlmaLinux, RockyLinux, 
SuSE and its derivatives etc., this makes package maintenance, builds and 
rebuilds much more complex. It ends up being quite a mess if you ever want to 
build on similar systems (e.g. the RH variants), so having a way of addressing 
this better would be an enormous help.

-- 
Reply to this email directly or view it on GitHub:
https://github.com/rpm-software-management/rpm/discussions/2654#discussioncomment-8721990


Re: [Rpm-maint] [rpm-software-management/rpm] Make rpm builds more reproducible (Issue #2590)

2023-08-07 Thread Simon J Mudd
To your comment:

>  in some workflows people receive an srpm from somewhere and build that. 

My answer is precisely that. **When I try to rebuild the single package, the 
process fails at the end of a 3-hour build run with an extremely obscure error 
message.**  Yet somehow it works for the upstream packager, as the rpms are 
built and shared publicly. The upstream packager's build environment is not 
public, and I've seen that in some cases some "munging" of the build 
environment is needed to make the build work. That's very messy.

However, I think you understand my point of view. The intent of me creating 
this issue was to bring it up. It seems you're aware of the problem space and 
have shared that others also experience similar issues.

Is there anything that can be done now? What should happen next? I do not think 
I can do anything right now and clearly any changes would be a long term 
effort. Can any further progress be made?

-- 
Reply to this email directly or view it on GitHub:
https://github.com/rpm-software-management/rpm/issues/2590#issuecomment-1667726799


Re: [Rpm-maint] [rpm-software-management/rpm] Make rpm builds more reproducible (Issue #2590)

2023-08-07 Thread Simon J Mudd
OK, so it depends on your point of view, that's clear.  For downstream 
"repackagers" there's clearly internal macro "mangling" and perhaps build 
environment differences specifically due to that, and while that's clearly a 
very important use case, it's not the same as mine.  It may be that some of 
this behaviour needs to be optional. But if the current rpm build process does 
not allow you to complete a rebuild correctly, or requires a large amount of 
investigation to work out how to reproducibly build packages, then to some 
extent I think it's fragile.
The link you provide just goes to show how fragile the current process is if 
you look at it in any detail, even if, in general, building seems to work. I 
suspect/know that things are more complex now than they were in the initial rpm 
days.

-- 
Reply to this email directly or view it on GitHub:
https://github.com/rpm-software-management/rpm/issues/2590#issuecomment-1667718170


Re: [Rpm-maint] [rpm-software-management/rpm] Make rpm builds more reproducible (Issue #2590)

2023-08-07 Thread Simon J Mudd
Also related to your comments about how rpm works with dependencies:

I'm aware of rpm dependencies. I've been building rpm packages since RedHat 
3.0.3 (that's in 1996).  Also, I'm not the owner of the rpms I want to rebuild, 
so it's not only a matter of rebuilding my own rpms; it's also a matter of 
figuring out how to rebuild others' rpms.  I could certainly suggest that the 
upstream packagers make changes to their packages, but that's a longer term 
discussion and it requires them to do this explicitly. The point of the issue 
I have created is to have rpm do this for us automatically, so I don't have to 
figure out the details myself.

-- 
Reply to this email directly or view it on GitHub:
https://github.com/rpm-software-management/rpm/issues/2590#issuecomment-1667348812


Re: [Rpm-maint] [rpm-software-management/rpm] Make rpm builds more reproducible (Issue #2590)

2023-08-06 Thread Simon J Mudd
In the end, **as a first step**, I'd like to be able to reproduce the builds 
as closely as possible to what the original builder did.  So I'd like that 
build to be reproducible in the way referenced in the URL.

- One thing is to record the rpm macro values used during the build process.
- another simple thing would be to record the full list of name / version of 
rpms installed on the build system.
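
The second bullet could be sketched roughly as follows in Python. This is only an illustration, not something rpm does today: it assumes the `rpm` binary is on the build host and uses `rpm -qa --queryformat` with the standard `NAME`, `EPOCHNUM`, `VERSION`, `RELEASE` and `ARCH` tags. The helper names and the sample output are made up for the example.

```python
from __future__ import annotations

import subprocess

# One package per line: name, epoch, version, release, arch.
# The raw string keeps the backslash so that rpm itself expands "\n".
QUERY_FORMAT = r"%{NAME} %{EPOCHNUM} %{VERSION} %{RELEASE} %{ARCH}\n"


def parse_package_list(output: str) -> list[dict[str, str]]:
    """Turn the query output into a sorted list of package records."""
    packages = []
    for line in output.splitlines():
        parts = line.split()
        if len(parts) != 5:
            continue  # skip anything unexpected rather than guess
        name, epoch, version, release, arch = parts
        packages.append({"name": name, "epoch": epoch, "version": version,
                         "release": release, "arch": arch})
    return sorted(packages, key=lambda p: (p["name"], p["arch"]))


def installed_packages() -> list[dict[str, str]]:
    """Query the local rpm database (needs rpm installed on the host)."""
    result = subprocess.run(["rpm", "-qa", "--queryformat", QUERY_FORMAT],
                            check=True, capture_output=True, text=True)
    return parse_package_list(result.stdout)


# Hypothetical sample output, used so the parser can be tried without rpm.
SAMPLE_OUTPUT = ("openssl-devel 1 3.0.7 24.el9 x86_64\n"
                 "cmake 0 3.20.2 8.el9 x86_64\n")

if __name__ == "__main__":
    for package in parse_package_list(SAMPLE_OUTPUT):
        print(package)
```

Sorting the records makes two lists from different build hosts easy to compare line by line.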

"rpm builds" are a bit of a pain, as often we may use multiple repos. The build 
environment does not explicitly say where the `BuildRequires:` packages come 
from, so `rpm` itself is unable to pull down the needed packages in order to 
trigger the builds.  `yum` or `dnf` could do that, but `rpm` doesn't know 
about this. The lack of integration between the two sets of tools has always 
somewhat surprised me, though I guess that's a different story and it's water 
under the bridge now.

I have changed the issue title to **make rpm builds more reproducible** as I 
think that's what I want and I think that by solving the 2 points above this 
would help a lot.

I'm also aware that if you run `rpmbuild -bs .spec` there's no binary build 
process going on so I'd guess it's quite possible that macro definitions and 
installed package lists may not actually be useful.  However, if you build 
source and binary packages together, which I'd assume is the correct thing to 
do, e.g. `rpmbuild -ba  whatever.spec` then 
you will have the information needed and could save this inside the `src.rpm`.

I think this would pretty much help solve my problem.

My second desire, **which is NOT relevant to this issue**, is to then be able 
to modify the build config or sources or add patches so that the finally built 
packages provide extra features compared to the originally built packaging, yet 
if done correctly the result will be compatible with the original packages.  
You could think of this along the lines of building extra kernel modules in a 
separate rpm provided as part of the build process which can be installed and 
used on the upstream running base kernel.  It's much easier to do this if you 
can reproduce the upstream build process first.

-- 
Reply to this email directly or view it on GitHub:
https://github.com/rpm-software-management/rpm/issues/2590#issuecomment-1667282383


Re: [Rpm-maint] [rpm-software-management/rpm] Make rpm rebuilds more reproducible (Issue #2590)

2023-08-04 Thread Simon J Mudd
I think you misunderstand. If you build for multiple OS versions, e.g. RHEL 
7..9, then the dependencies are different, so as rpm works at the moment you 
can't explicitly configure all the dependencies at the same time. Instead, 
additional macro processing is added so that the spec is told which OS to build 
for, e.g. Oracle community MySQL rpms tend to use `--define 'rh7 1'`, 
`--define 'rh8 1'` or `--define 'rh9 1'` to provide the "hints" to build the 
package. This can't be done during the build process, as `BuildRequires:` or 
`Requires:` tags are "static" by this time (evaluated prior to the build by 
`rpmbuild`).

Yet to be reproducible you need to know which macros were provided by the user, 
or at least which are "not part of the base rpm/build setup" and therefore are 
"user configurable". Without that you may not be able to determine exactly the 
same "input parameters" to rebuild a package in the same way as the original 
packager. Similarly, if I want to patch this package with different/extra 
functionality, I cannot be sure that my build will represent a consistent 
change against the original sources, and thus the whole premise of repeatable 
builds falls apart.

My question therefore was whether there are any plans to ensure that a built 
src rpm could be configured to include any rpmbuild time command line macros as 
metadata so that I can use that in theory to reproduce the build process 
faithfully.
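
Until something like that exists in rpm itself, a wrapper can approximate it from the outside. The Python sketch below is only an assumption about how that might look: it builds the `rpmbuild -ba` command line from a set of defines and produces a JSON record that could be stored next to the resulting src.rpm. The spec name `mysql.spec`, the record layout and the helper names are hypothetical.

```python
from __future__ import annotations

import json
import shlex


def rpmbuild_command(spec: str, defines: dict[str, str]) -> list[str]:
    """Build the argv for 'rpmbuild -ba' with one --define per macro."""
    argv = ["rpmbuild", "-ba"]
    for name, value in defines.items():
        argv += ["--define", f"{name} {value}"]
    argv.append(spec)
    return argv


def build_record(spec: str, defines: dict[str, str]) -> str:
    """Return a JSON document describing the command-line macros used."""
    argv = rpmbuild_command(spec, defines)
    record = {
        "spec": spec,
        "defines": defines,
        "command": shlex.join(argv),  # shell-quoted, can be pasted back
    }
    return json.dumps(record, indent=2, sort_keys=True)


if __name__ == "__main__":
    # 'rh9 1' is the kind of hint mentioned above; the spec name is made up.
    print(build_record("mysql.spec", {"rh9": "1"}))
```

The record only captures macros passed on the command line. Macros coming from the build host's own macro files would still need to be captured separately.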

The reason for this coming up: I'm having problems reproducing the builds 
because it looks like the build environment used by the original packagers 
may not be "pristine". To fix one specific build issue I had to add some 
symlinks, e.g. 
https://github.com/sjmudd/mysql-rpm-builder/blob/main/config/prepare__centos.8__8.0.33.sh#L44-L50
to set up the OS in a way in which the build process would complete without 
errors.  That's clearly a very obscure example, but it's real.

So I'm looking at ways to make it easier for the rpm packaging to be 
configured so that issues such as this can be avoided and, given a 
"suitable spec file", I really can repeatably build from scratch successfully.

-- 
Reply to this email directly or view it on GitHub:
https://github.com/rpm-software-management/rpm/issues/2590#issuecomment-1665499416


[Rpm-maint] [rpm-software-management/rpm] Make rpm rebuilds more reproducible (Issue #2590)

2023-07-29 Thread Simon J Mudd
I was looking at rebuilding some MySQL community rpms which are normally built 
by Oracle, but doing this turns out to be surprisingly hard.  The spec file, 
https://github.com/mysql/mysql-server/blob/trunk/packaging/rpm-oel/mysql.spec.in,
 uses a number of macros to define various parts of the build process, 
`BuildRequires:` entries and so on, depending on the OS being used. This works 
for RHEL 7..9 but of course should also work for CentOS 7..9 and other similar 
distros.

However, to reproduce a build made by someone else you need to know the exact 
macro definitions in effect when the rpms were built.  Unless I'm mistaken, if 
you build a package with `rpmbuild --define 'something 1' --define 
'something_else 1' name.spec`, then the actual command line arguments used to 
build the package are not explicitly recorded in the binary rpms nor, perhaps 
more importantly, in the src.rpm, which I think only contains the sources and 
the spec file used.

If that's the case the lack of recording this information means that from a src 
rpm I may be unable to rebuild the binary rpms in the same way as the original 
packager.  Is this assumption correct?  If so would it make sense that the 
.src.rpm also included the command line defines (and anything else that might 
make sense) to simplify this task?

I also notice that building any software depends on the installed software on 
the host/container where the build process runs, yet this is also not 
"registered". For rpm systems it might be convenient to also record the 
installed rpm package list as that would also be useful for reproducing the 
build environment appropriately.
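
If such a list were recorded, comparing it against the rebuild host becomes straightforward. The Python sketch below assumes each environment is available as a mapping from package name to a version-release string. The `diff_environments` helper and the sample data are hypothetical and not part of rpm.

```python
from __future__ import annotations


def diff_environments(original: dict[str, str],
                      rebuild: dict[str, str]) -> dict[str, object]:
    """Compare two {package name: version-release} maps."""
    missing = sorted(set(original) - set(rebuild))  # only on original host
    extra = sorted(set(rebuild) - set(original))    # only on rebuild host
    changed = {
        name: (original[name], rebuild[name])
        for name in sorted(set(original) & set(rebuild))
        if original[name] != rebuild[name]
    }
    return {"missing": missing, "extra": extra, "changed": changed}


# Made-up package lists for illustration only.
ORIGINAL = {"cmake": "3.20.2-8.el9", "openssl-devel": "3.0.7-24.el9",
            "libtirpc-devel": "1.3.3-2.el9"}
REBUILD = {"cmake": "3.20.2-9.el9", "openssl-devel": "3.0.7-24.el9",
           "rpcgen": "1.4-9.el9"}

if __name__ == "__main__":
    for kind, detail in diff_environments(ORIGINAL, REBUILD).items():
        print(kind, detail)
```

A non-empty result does not prove the rebuild will differ, but it gives a concrete place to start looking when a rebuild fails for an obscure reason.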

Outside of rpm itself is repo configuration which is OS dependent and it seems 
that RHEL/CentOS/OEL and the other RH-clones all do things slightly differently 
which makes rebuilds more complex.  I guess that's outside of the scope of this 
issue.

Why would improving this be useful if the source is provided? Simply because I 
may want to patch the originally built rpms in a specific way yet be sure that 
the rest of the build and packaging process is as close to the original 
packaging as before.

Alternatively I may want to build a sub-module of the upstream packages which 
is compatible with the originally built packages and can be used without having 
to rebuild the whole upstream code again.

So far I've not seen a way to make this process simpler and think the 
suggestions above, to include more information on the build command line 
arguments (and maybe macro values) and the installed package list, would help 
the rebuild process.

Can something be done in this direction?

For context I created: https://github.com/sjmudd/mysql-rpm-builder/ which was 
an attempt to simplify / document the reproducible rebuild process and it has 
turned out to be harder than originally anticipated.  It is still work in 
progress but maybe gives some context to where the question comes from.

-- 
Reply to this email directly or view it on GitHub:
https://github.com/rpm-software-management/rpm/issues/2590