[arch-dev-public] Reproducible Builds July Update

2020-07-09 Thread Jelle van der Waa
Hi All,

A lot of work has been put into getting Arch packages 100% reproducible,
and the [community] repository has been added to the rebuilderd instance
on reproducible.archlinux.org. As of now [community] is 60% reproducible.
Packages with file ordering issues have been rebuilt and are about
to be moved to [community], which will hopefully increase the
reproducible package percentage!

Archweb now imports the rebuilderd status into its database, so you are
now able to view your reproducible package issues. [1]

In your Archweb profile, it's possible to get email notifications
when packages change from reproducible to unreproducible. [2]

As of now, the software which rebuilds our packages stores no logs/diffs
of the unreproducible packages. To get a diff:

1) Retrieve the package
2) run "repro -d $pkg"

Or, when a package is freshly built, run:

$ repro -ndf $pkg

to rebuild the newly built package.

For some package groups, such as KDE packages, packages with JAR files
and Haskell packages, there are "structural" issues; see the wiki for
those issues. [3]

For Haskell packages, a filesystem/perl bug has to be resolved. [4]

[1] https://www.archlinux.org/devel/reports/non-reproducible-packages/$username/
[2] https://www.archlinux.org/devel/profile/
[3] https://wiki.archlinux.org/index.php/Reproducible_Builds/Status
[4] https://wiki.archlinux.org/index.php/Reproducible_Builds/Status

Greetings,

Jelle





Re: [arch-dev-public] Use detached package signatures by default

2020-07-09 Thread Anatol Pomozov via arch-dev-public
Hi Jelle

On Thu, Jul 9, 2020 at 2:00 AM Jelle van der Waa wrote:
>
> On 09/07/2020 05:05, Anatol Pomozov via arch-dev-public wrote:
> > TLDR; let’s start using detached package signatures to make system
> > updates faster.
> >
> > Hi folks,
> >
> > Some time ago there was a discussion on IRC where someone (Allan
> > maybe?) proposed to stop using embedded PGP signatures in favor of
> > detached signature files. I would like to bring this idea here and
> > quantify it with some numbers.
>
> The downside of not having the package signatures in the database is
> that consumers cannot easily obtain this information. For archweb,
> that means showing who signed the package on the package details page.
>
> How would I implement an efficient alternative without fetching package
> files or all the sig files? A separate sig database? :P

The best option is to download and parse the signature file directly.
Its filename is going to be the package filename with ".sig" appended,
where the package filename is available in a package description as the
%FILENAME% entry.
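
As a rough illustration of the lookup described above, the sketch below
parses one package description and derives the signature filename. It
assumes the usual layout of pacman database entries (a %FIELD% header
line followed by value lines, with fields separated by blank lines). The
function names and the sample package are hypothetical and are not taken
from archweb or pacman.

```python
def parse_desc(text):
    """Parse a pacman-style package description into {field: [values]}."""
    fields = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            current = None  # a blank line ends the current field
        elif current is None and len(line) > 2 and line.startswith("%") and line.endswith("%"):
            current = line.strip("%")  # "%FILENAME%" -> "FILENAME"
            fields[current] = []
        elif current is not None:
            fields[current].append(line)
    return fields


def signature_filename(desc_text):
    """Return the detached signature filename for a package description."""
    filename = parse_desc(desc_text)["FILENAME"][0]
    return filename + ".sig"


# Hypothetical sample entry, for illustration only.
SAMPLE_DESC = """%FILENAME%
example-1.0-1-x86_64.pkg.tar.zst

%NAME%
example

%VERSION%
1.0-1
"""

print(signature_filename(SAMPLE_DESC))
```

A consumer such as archweb would then fetch only that one .sig file for
the package it is displaying, instead of all signatures.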

> For now I'll have to adjust the code not to break because of a
> missing PGPSIG entry.
>
> > Here are a few technical details on this topic. Pacman has the
> > ability to verify the authenticity of package files with PGP
> > signatures. PGP signatures add protection against undesired package
> > modifications by a third party and improve the security of package
> > management. This feature can be configured per repository and the
> > official Arch Linux repos have it enabled. Package signatures have
> > been used by Arch Linux successfully for a couple of years now.
>
> 
>
> > An alternative to embedded signatures is detached signatures. These
> > are signatures stored in a separate file next to the package itself
> > (in a .sig file to be specific). Instead of downloading *all*
> > signatures every time a database is updated, detached signatures are
> > downloaded only when a specific package is installed/updated. If Arch
> > could switch to this model, database files would become 3 times
> > smaller, which saves users bandwidth and system update time.
>
> It would be insightful to provide the database numbers, because one
> could argue 30% of 1MB is nothing, while 30% of 100M is a nice
> improvement.
>
> Our biggest database should be community (5M atm), and with all the
> savings that would now be ~ 2 MB? Would be nice to have an overview of
> the real life numbers :)

For the compressed "community" database the savings are going to be
5.2M -> 1.73M (gzip) or 1.26M (zstd -19). Including the other dbs, I
would say that an average user is looking at a total reduction of
7M -> 2.2M in database size.
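
To put these figures in proportion, here is a quick calculation using
only the sizes quoted above. The helper name `reduction` is illustrative,
and the sizes are the approximate ones from this mail, not fresh
measurements.

```python
def reduction(before_mb, after_mb):
    """Return (shrink factor, percent saved) for a size change."""
    factor = before_mb / after_mb
    percent_saved = (1 - after_mb / before_mb) * 100
    return factor, percent_saved


# Sizes in MB as quoted in this thread.
cases = {
    "community (gzip)": (5.2, 1.73),
    "community (zstd -19)": (5.2, 1.26),
    "all databases, average user": (7.0, 2.2),
}

for label, (before, after) in cases.items():
    factor, saved = reduction(before, after)
    print(f"{label}: {factor:.1f}x smaller, {saved:.0f}% saved")
```

With the quoted sizes, the gzip case works out to roughly a factor of
three, which matches the "3 times smaller" estimate in the proposal.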

Keep in mind that database downloading/parsing is on the critical
path. Every user downloads these db files pretty much
every time "pacman -Sy" is run. Detached signatures make this step
faster by reducing the workload and downloading signatures on demand later.


Re: [arch-dev-public] Use detached package signatures by default

2020-07-09 Thread Jelle van der Waa
On 09/07/2020 05:05, Anatol Pomozov via arch-dev-public wrote:
> TLDR; let’s start using detached package signatures to make system
> updates faster.
> 
> Hi folks,
> 
> Some time ago there was a discussion on IRC where someone (Allan
> maybe?) proposed to stop using embedded PGP signatures in favor of
> detached signature files. I would like to bring this idea here and
> quantify it with some numbers.

The downside of not having the package signatures in the database is
that consumers cannot easily obtain this information. For archweb,
that means showing who signed the package on the package details page.

How would I implement an efficient alternative without fetching package
files or all the sig files? A separate sig database? :P

For now I'll have to adjust the code not to break because of a
missing PGPSIG entry.
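
A minimal sketch of such a tolerant lookup, assuming the package
description has already been parsed into a dictionary mapping field
names to lists of values, and assuming the embedded signature is stored
base64-encoded. The function name `signature_or_none` and the sample
data are hypothetical and are not taken from archweb.

```python
import base64


def signature_or_none(fields):
    """Return the decoded embedded signature, or None if it is absent."""
    values = fields.get("PGPSIG")
    if not values:
        return None  # database built without embedded signatures
    return base64.b64decode(values[0])


# Hypothetical parsed entries, for illustration only.
with_sig = {
    "FILENAME": ["example-1.0-1-x86_64.pkg.tar.zst"],
    "PGPSIG": [base64.b64encode(b"dummy signature bytes").decode()],
}
without_sig = {"FILENAME": ["example-1.0-1-x86_64.pkg.tar.zst"]}

print(signature_or_none(with_sig))
print(signature_or_none(without_sig))
```

Callers can then treat None as "no signer information in the database"
rather than failing on a missing key.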

> Here are a few technical details on this topic. Pacman has the
> ability to verify the authenticity of package files with PGP
> signatures. PGP signatures add protection against undesired package
> modifications by a third party and improve the security of package
> management. This feature can be configured per repository and the
> official Arch Linux repos have it enabled. Package signatures have
> been used by Arch Linux successfully for a couple of years now.



> An alternative to embedded signatures is detached signatures. These
> are signatures stored in a separate file next to the package itself
> (in a .sig file to be specific). Instead of downloading *all*
> signatures every time a database is updated, detached signatures are
> downloaded only when a specific package is installed/updated. If Arch
> could switch to this model, database files would become 3 times
> smaller, which saves users bandwidth and system update time.

It would be insightful to provide the database numbers, because one
could argue 30% of 1MB is nothing, while 30% of 100M is a nice
improvement.

Our biggest database should be community (5M atm), and with all the
savings that would now be ~ 2 MB? Would be nice to have an overview of
the real life numbers :)

Greetings,

Jelle van der Waa



