Re: Reproducing the release tarballs

2024-04-01 Thread Dan Fandrich via curl-library
On Sun, Mar 31, 2024 at 11:24:27AM +0200, Daniel Stenberg wrote:
> On Sat, 30 Mar 2024, Dan Fandrich via curl-library wrote:
> 
> > SPDX seems to be the standard SBOM format for this that tools are
> > starting to expect.  The format is able to handle complex situations,
> > but given the very limited scope needed in curl and for source releases
> > only, once you get a template file set up the first time filling in the
> > details for every release should be simple.
> 
> I can't help but feel that this is aiming (much) higher than what I want to
> do. If someone truly thinks SPDX is a better way to provide this information
> then I hope someone will step up and convert the scripts to instead use this
> format.
> 
> This is a SBOM for the tarball creation, not for curl.

Well, what is the tarball but the tarball of "curl"?  SPDX can provide
information on the files in the tarball as well as the files used to create the
tarball. How much you provide is up to you, but the more information available,
the more possibilities there are for others to use it.

> I'd rather start with something basic and simple, as we don't even know if
> anyone cares or wants this information.

That makes sense. SPDX is definitely heavier weight than a few version numbers
in an .md file. But, a lot more useful, too.

> > Even running "reuse spdx" in the curl tree (the same tool that's keeping
> > curl in REUSE compliance in that CI build) will output a SPDX file for
> > curl.
> 
> I tried it just now. It produces 86,000 lines of output! And yet I can't
> find a lot of helpful content within the output for our purpose here.

That example was just the first one I thought of that you might already have on
your system (due to the work in getting REUSE compliance some time ago). It
doesn't solve the problem at hand, but it shows what SPDX looks like and it
could still be integrated into a final curl SPDX file provided with each
release if we wanted it to. Few projects provide SPDX files right now, which is
why companies using SPDX only for license compatibility checking need to run
"reuse spdx" on the source code themselves. But if curl provided that SPDX file
already filled in with each release, including the additional information on
the dependencies used to create the tarball itself, that single file could
serve two purposes.  Even more purposes, actually, since it could additionally
be used for security scanning, such as finding that curl used a back-door
autoconf m4 macro found only in the tarball (if that ends up happening one
day).
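
As a rough sketch of the "found only in the tarball" check, the function below
takes two file listings (one path per line) and prints the paths that appear
only in the second one. The function name and the listing files are made up
for illustration; nothing like this exists in the curl tree as far as this
thread says.

```shell
#!/bin/sh
# only_in_tarball GIT_LIST TARBALL_LIST   (hypothetical helper)
# Prints paths present in TARBALL_LIST but absent from GIT_LIST.
# Both arguments are plain text files with one path per line.
only_in_tarball() {
    git_sorted=$(mktemp) || return 1
    tar_sorted=$(mktemp) || { rm -f "$git_sorted"; return 1; }
    # comm needs both inputs sorted in the same collation order
    LC_ALL=C sort -u "$1" > "$git_sorted"
    LC_ALL=C sort -u "$2" > "$tar_sorted"
    # -13 suppresses lines unique to the first file and lines common to both
    LC_ALL=C comm -13 "$git_sorted" "$tar_sorted"
    rm -f "$git_sorted" "$tar_sorted"
}
```

The listings could come from "git ls-files > git.list" and
"tar -tzf curl-X.Y.Z.tar.gz | sed 's|^[^/]*/||' > tar.list" (the sed strips the
leading directory). Directory entries and generated files such as configure
will show up as expected differences; anything else deserves a closer look.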

> It does not seem like a suitable tool for this.

Agreed. It just gives a flavour of one of the kinds of things a SPDX file can
provide, but could become part of a solution.

A tool that might actually do what you want is
https://pypi.org/project/distro2sbom/  That creates a SPDX file listing all the
packages in the current system (e.g. Debian packages on Debian).  You probably
don't want to run that on your personal system (way too many irrelevant
packages), but it could be run from a minimal container used just to create a
tarball to provide a more easily reproducible set of packages for others to
fall back on if they want to completely reproduce that build process.

Dan
-- 
Unsubscribe: https://lists.haxx.se/mailman/listinfo/curl-library
Etiquette:   https://curl.se/mail/etiquette.html


Re: Reproducing the release tarballs

2024-03-31 Thread Daniel Stenberg via curl-library

On Sat, 30 Mar 2024, Dan Fandrich via curl-library wrote:

> SPDX seems to be the standard SBOM format for this that tools are starting
> to expect.  The format is able to handle complex situations, but given the
> very limited scope needed in curl and for source releases only, once you get
> a template file set up the first time filling in the details for every
> release should be simple.


I can't help but feel that this is aiming (much) higher than what I want to do. 
If someone truly thinks SPDX is a better way to provide this information then 
I hope someone will step up and convert the scripts to instead use this 
format.


This is a SBOM for the tarball creation, not for curl.

I'd rather start with something basic and simple, as we don't even know if 
anyone cares or wants this information.


> Even running "reuse spdx" in the curl tree (the same tool that's keeping
> curl in REUSE compliance in that CI build) will output a SPDX file for curl.


I tried it just now. It produces 86,000 lines of output! And yet I can't find 
a lot of helpful content within the output for our purpose here.


It does not seem like a suitable tool for this.

--

 / daniel.haxx.se
 | Commercial curl support up to 24x7 is available!
 | Private help, bug fixes, support, ports, new features
 | https://curl.se/support.html


Re: Reproducing the release tarballs

2024-03-30 Thread Dan Fandrich via curl-library
On Sat, Mar 30, 2024 at 06:29:48PM +0100, Daniel Stenberg via curl-library 
wrote:
> Any proposals for how to document the exact set of tools+versions I use for
> each release in case someone in the future wants to reproduce an ancient
> release tarball?

SPDX seems to be the standard SBOM format for this that tools are starting to
expect.  The format is able to handle complex situations, but given the very
limited scope needed in curl and for source releases only, once you get a
template file set up the first time filling in the details for every release
should be simple.

The spec is at https://spdx.dev/use/specifications/ but it's probably easier to
look at some simple examples to get a feel for it. Even running "reuse spdx" in
the curl tree (the same tool that's keeping curl in REUSE compliance in that CI
build) will output a SPDX file for curl. That one doesn't include the source
build dependencies that you're interested in (because that's not what that
particular tool does) but could be a start of something. The curl SBOM could
also include Debian package names+versions as dependencies.
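
To give a feel for what such a dependency entry might look like, here is a
minimal sketch that prints one SPDX 2.3 tag-value package stanza per build
tool. The function name is invented for this example, and
SPDXRef-Package-curl is a hypothetical identifier for the curl package that
would have to be defined elsewhere in the same SPDX document.

```shell
#!/bin/sh
# spdx_tool_stanza NAME VERSION   (hypothetical helper)
# Prints an SPDX 2.3 tag-value package entry for one build tool and relates
# it to the (assumed) SPDXRef-Package-curl element as a build tool.
spdx_tool_stanza() {
    printf 'PackageName: %s\n' "$1"
    printf 'SPDXID: SPDXRef-Package-%s\n' "$1"
    printf 'PackageVersion: %s\n' "$2"
    # The download location is mandatory; NOASSERTION makes no claim about it
    printf 'PackageDownloadLocation: NOASSERTION\n'
    printf 'FilesAnalyzed: false\n'
    printf 'Relationship: SPDXRef-Package-%s BUILD_TOOL_OF SPDXRef-Package-curl\n' "$1"
}
```

Calling "spdx_tool_stanza autoconf 2.71" would emit one such stanza; a full
document also needs the usual header fields (SPDXVersion, DataLicense,
DocumentName and so on), which this sketch leaves out.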

Dan


Re: Reproducing the release tarballs

2024-03-30 Thread Daniel Stenberg via curl-library

On Sat, 30 Mar 2024, Jeffrey Walton wrote:


> If I am not mistaken, you usually take the Autotools gear that is
> provided by the distro. There's no need to chase m4 files.


I'm talking about these m4 files:

$ ls -l m4/*m4 | wc -l
28

They are our custom autoconf functions.


> However, you should download the latest config.sub and config.guess to
> package with your tarball


No need. They are generated by autoreconf.



Re: Reproducing the release tarballs

2024-03-30 Thread Daniel Stenberg via curl-library

On Sat, 30 Mar 2024, Daniel Stenberg via curl-library wrote:

> For the most recent curl release, my toolset that I believe might affect the
> results include:


Since I do all releases on Debian Linux and they occasionally apply patches 
that make them deviate from the upstream versions, it was pointed out to me 
that listing the exact Debian package name is probably better and more 
accurate.


Let's generate a docs/RELEASE-TOOLS.md file into the tarball with the details:

  https://github.com/curl/curl/pull/13239

The current file the script generates for me looks like this:

--- snip ---
# Release tools

The following tools and their Debian package version numbers were used to
produce this release tarball.

- autoconf: 2.71-3
- automake: 1:1.16.5-1.3
- libtool: 2.4.7-7
- make: 4.3-4.1
- perl: 5.38.2-3.2
- git: 1:2.43.0-1+b1

# Reproduce the tarball

- Clone the repo and checkout the release tag
- Install the same set of tools + versions as listed above

## Do a standard build

- autoreconf -fi
- ./configure [...]
- make

## Generate the tarball

- ./maketgz [version]

--- snip ---
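
A list in that style could be produced along these lines. This is only a
sketch, not the script from the pull request: both function names are
invented here, and it assumes a Debian system where dpkg-query is available.

```shell
#!/bin/sh
# tool_line NAME VERSION   (hypothetical helper)
# Formats one entry in the same "- name: version" style as the list above.
tool_line() {
    printf '%s\n' "- $1: $2"
}

# debian_tool_versions PACKAGE...   (hypothetical helper)
# Prints one line per package with its installed Debian package version.
# Packages that are not installed (or a missing dpkg-query) give "unknown".
debian_tool_versions() {
    for pkg in "$@"; do
        ver=$(dpkg-query -W -f='${Version}' "$pkg" 2>/dev/null) || ver=""
        tool_line "$pkg" "${ver:-unknown}"
    done
}
```

For the tools named above the call would be
"debian_tool_versions autoconf automake libtool make perl git".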




Re: Reproducing the release tarballs

2024-03-30 Thread Rod Widdowson via curl-library
I usually check in the GPG signatures of the downloaded artifacts (which I have
checked against verified keys) and make that part of the tag.


Re: Reproducing the release tarballs

2024-03-30 Thread Jeffrey Walton via curl-library
On Sat, Mar 30, 2024 at 2:40 PM Daniel Stenberg via curl-library
 wrote:
>
> On Sat, 30 Mar 2024, Howard Chu wrote:
>
> > IMO only project developers should ever be touching the autotools.
> ...
>
> > Only our release engineer ever generates the configure script, and it's
> > committed to the repo along with everything else.
>
> For people using releases, it does not matter since the scripts are in the
> tarballs.
>
> Even if generated files are committed, there still needs to be a system or way
> to verify that the scripts are indeed generated correctly.
>
> Since Makefile.am, configure.ac, and *m4 files are updated quite frequently,
> it would be a lot of overhead to have to commit updated scripts (by a single
> person). It does not feel like a concept I would be comfortable with.

If I am not mistaken, you usually take the Autotools gear that is
provided by the distro. There's no need to chase m4 files.

However, you should download the latest config.sub and config.guess to
package with your tarball per
.
So your build script or bootstrap.sh would include something like:

# Pick a downloader; each command takes the output file as its next argument
if command -v wget >/dev/null 2>&1; then
    FETCH_CMD="wget -q -O"
elif command -v curl >/dev/null 2>&1; then
    FETCH_CMD="curl -L -s -o"
else
    FETCH_CMD="curl-and-wget-not-found"
fi

# Non-zero on macOS; falls back to plain grep if GREP is not set
IS_DARWIN=$(uname -s 2>&1 | "${GREP:-grep}" -i -c darwin)

echo "Updating config.sub"
if ${FETCH_CMD} config.sub.new \
    'https://git.savannah.gnu.org/gitweb/?p=config.git;a=blob_plain;f=config.sub' \
    >/dev/null 2>&1; then

    # Solaris removes +w, can't overwrite
    chmod +w build-aux/config.sub
    mv config.sub.new build-aux/config.sub
    chmod +x build-aux/config.sub

    # "command -v" is a command, not a test expression, so it stays outside [ ]
    if [ "$IS_DARWIN" -ne 0 ] && command -v xattr >/dev/null 2>&1; then
        echo "Removing config.sub quarantine"
        xattr -d "com.apple.quarantine" build-aux/config.sub >/dev/null 2>&1
    fi
fi

echo "Updating config.guess"
if ${FETCH_CMD} config.guess.new \
    'https://git.savannah.gnu.org/gitweb/?p=config.git;a=blob_plain;f=config.guess' \
    >/dev/null 2>&1; then

    # Solaris removes +w, can't overwrite
    chmod +w build-aux/config.guess
    mv config.guess.new build-aux/config.guess
    chmod +x build-aux/config.guess

    if [ "$IS_DARWIN" -ne 0 ] && command -v xattr >/dev/null 2>&1; then
        echo "Removing config.guess quarantine"
        xattr -d "com.apple.quarantine" build-aux/config.guess >/dev/null 2>&1
    fi
fi

Jeff


Re: Reproducing the release tarballs

2024-03-30 Thread Daniel Stenberg via curl-library

On Sat, 30 Mar 2024, Howard Chu wrote:


> IMO only project developers should ever be touching the autotools.
>
> ...
>
> Only our release engineer ever generates the configure script, and it's
> committed to the repo along with everything else.


For people using releases, it does not matter since the scripts are in the 
tarballs.


Even if generated files are committed, there still needs to be a system or way 
to verify that the scripts are indeed generated correctly.


Since Makefile.am, configure.ac, and *m4 files are updated quite frequently, 
it would be a lot of overhead to have to commit updated scripts (by a single 
person). It does not feel like a concept I would be comfortable with.




Re: Reproducing the release tarballs

2024-03-30 Thread Daniel Stenberg via curl-library

On Sat, 30 Mar 2024, jim.ful...@webcomposite.com wrote:

> While we are here … can we outline all processes to tarball - for example I
> see no signing step


I did not mention signing because it does not strictly affect the tarball as 
the signature is separate. I gpg sign every release and have done so for more 
than a decade.
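
For anyone who wants to check such a detached signature, a minimal sketch
follows. The wrapper name is invented for this example, and it assumes the
release signing key has already been imported into the local gpg keyring.

```shell
#!/bin/sh
# verify_release TARBALL SIGNATURE   (hypothetical wrapper)
# Returns 0 if gpg accepts the detached signature, 2 if an input file is
# missing, and gpg's own non-zero status otherwise.
verify_release() {
    [ -f "$1" ] && [ -f "$2" ] || return 2
    # For detached signatures gpg takes the signature first, then the data
    gpg --verify "$2" "$1"
}
```

With placeholder file names, that would be run as
"verify_release curl-X.Y.Z.tar.gz curl-X.Y.Z.tar.gz.asc".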


> - also wonder if we need to consider signing tarballs (and all release
> artefacts) using cosign ?


What benefits would that bring?



Re: Reproducing the release tarballs

2024-03-30 Thread Howard Chu via curl-library
Daniel Stenberg via curl-library wrote:
> Hello,
> 
> In the light of the xz attack, I would like to mention that in order to 
> reproduce the tarballs I upload for curl release, this is necessary:
> 
> - Clone the repo and checkout the release tag
> 
> - Install the same set of tools + versions I use
> 
> - run "./maketgz [version]"
> 
> For the most recent curl release, my toolset that I believe might affect the 
> results include:
> 
> - autoconf (GNU Autoconf) 2.71
> - automake (GNU automake) 1.16.5
> - libtoolize (GNU libtool) 2.4.7
> - GNU Make 4.3
> - perl v5.38.2
> - git version 2.43.0
> 
> (make, perl and git most probably have very little effect but I figure 
> including them in the list could be worth it since they are invoked in the 
> release process)
> 
> Any proposals for how to document the exact set of tools+versions I use for 
> each release in case someone in the future wants to reproduce an ancient 
> release tarball?
> 
IMO only project developers should ever be touching the autotools. Source code
tarballs should have a correctly generated configure script included, and of
course those scripts list the version number of the autoconf script that
generated them. This is how it works in OpenLDAP, anyway. Only our release
engineer ever generates the configure script, and it's committed to the repo
along with everything else.
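
The version recorded in a generated configure script can be pulled out with
something like the sketch below. The function name is made up, and it assumes
the usual "# Generated by GNU Autoconf <version>" header comment that
Autoconf writes near the top of the script.

```shell
#!/bin/sh
# autoconf_version_of CONFIGURE_SCRIPT   (hypothetical helper)
# Prints the Autoconf version found in the header comment of a generated
# configure script, or nothing if no such comment is present.
autoconf_version_of() {
    sed -n 's/^# Generated by GNU Autoconf \([0-9][0-9.]*[0-9a-z]\).*/\1/p' "$1" |
        head -n 1
}
```

Run as "autoconf_version_of ./configure" in an unpacked release tarball.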

-- 
  -- Howard Chu
  CTO, Symas Corp.   http://www.symas.com
  Director, Highland Sun http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/


Reproducing the release tarballs

2024-03-30 Thread Daniel Stenberg via curl-library

Hello,

In the light of the xz attack, I would like to mention that in order to 
reproduce the tarballs I upload for curl release, this is necessary:


- Clone the repo and checkout the release tag

- Install the same set of tools + versions I use

- run "./maketgz [version]"
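
Once those steps have produced a tarball, checking it against the published
one could look like the sketch below. The function name and the file names in
the note are assumptions for illustration, and sha256sum is assumed to be
available (it is part of GNU coreutils).

```shell
#!/bin/sh
# same_sha256 FILE_A FILE_B   (hypothetical helper)
# Returns 0 when both files have the same SHA-256 digest, non-zero otherwise
# (including when either file cannot be read).
same_sha256() {
    a=$(sha256sum < "$1" | cut -d ' ' -f 1)
    b=$(sha256sum < "$2" | cut -d ' ' -f 1)
    [ -n "$a" ] && [ -n "$b" ] && [ "$a" = "$b" ]
}
```

For example:
"same_sha256 curl-X.Y.Z.tar.gz downloaded/curl-X.Y.Z.tar.gz && echo reproduced".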

For the most recent curl release, my toolset that I believe might affect the 
results include:


- autoconf (GNU Autoconf) 2.71
- automake (GNU automake) 1.16.5
- libtoolize (GNU libtool) 2.4.7
- GNU Make 4.3
- perl v5.38.2
- git version 2.43.0

(make, perl and git most probably have very little effect but I figure 
including them in the list could be worth it since they are invoked in the 
release process)


Any proposals for how to document the exact set of tools+versions I use for 
each release in case someone in the future wants to reproduce an ancient 
release tarball?

