On Wed, Apr 06, 2022 at 02:15:02AM +0200, Jason A. Donenfeld wrote:
> 2) Comparability: other distros use SHA2-512, as well as various
> upstreams, which means we can compare our hashes to theirs easily.
Can we expand on this specific thread for a moment?

I was the author of GLEP59 about changing the Manifest hashes, and I
noted at the time, with references, that the effective strength of a set
of hashes is only that of the strongest hash.

One of my regrets from GLEP59 is that it made things harder for use
cases outside of the normal user distfile workflow.

The use case that impacted me the most was being able to compare our
distfiles over time against external sources, esp. if the file goes
missing or was fetch-restricted and we can't produce a new hash of it.
Maybe upstream only ever published SHA1/SHA256, and we only ever
calculated SHA512/BLAKE2b on the file. Since we never had hashes from
both sides at the same time, we cannot prove it was the same file.

We need to be able to ship one or more hashes to users, for the specific
use case of validating the distfiles they download.

As a developer, I'd like to be able to track the other hashes for a
file, without forcing ourselves to retain the file. This might be to
compare with upstream published hashes, or to compare with other
distros.
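
A rough sketch of what tracking the extra hashes could look like, assuming
Python's hashlib. The function names and the dict layout are made up for
illustration and are not an existing Portage or Manifest format. The point
is that a comparison is only possible when both sides share at least one
algorithm:

```python
import hashlib

def record_hashes(data, algorithms=("sha1", "sha256", "sha512", "blake2b")):
    """Hypothetical helper: keep several digests so the file itself
    does not have to be retained."""
    return {name: hashlib.new(name, data).hexdigest() for name in algorithms}

def compare_with_published(record, published):
    """Compare a stored record against hashes published elsewhere.

    Returns True or False when at least one algorithm is shared, and
    None when there is no common algorithm, i.e. nothing can be proven.
    """
    shared = set(record) & set(published)
    if not shared:
        return None
    return all(record[name] == published[name].lower() for name in shared)
```

With only SHA512/BLAKE2b recorded and only SHA1/SHA256 published upstream,
compare_with_published() returns None, which is exactly the situation
described above.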

In fact it would be really nice to have a semi-automated pipeline to
plug in signed upstream hashes to our Manifests, and make it possible to
prove our new SHA512/BLAKE2B hash was taken over the correct input in
the first place, and there wasn't any subtle supply-chain attack early
in the packaging process.
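
One step of such a pipeline might look roughly like this. It is a sketch
only: the function name is hypothetical, checking the signature on the
upstream hash is assumed to have happened already, and the DIST line
layout (name, size, then BLAKE2B and SHA512) is my assumption about the
Manifest entry being produced. The new hashes are taken over the same
bytes that were just checked against the upstream hash:

```python
import hashlib
import hmac

def manifest_entry_if_upstream_matches(filename, data, upstream_algo, upstream_hex):
    """Hypothetical helper: refuse to emit new hashes unless the input
    matches the hash upstream published."""
    actual = hashlib.new(upstream_algo, data).hexdigest()
    if not hmac.compare_digest(actual, upstream_hex.lower()):
        raise ValueError(f"{filename}: {upstream_algo} does not match upstream")
    # Same bytes as verified above, so the new hashes cover the same input.
    blake2b = hashlib.blake2b(data).hexdigest()
    sha512 = hashlib.sha512(data).hexdigest()
    return f"DIST {filename} {len(data)} BLAKE2B {blake2b} SHA512 {sha512}"
```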

Where would those hashes go? They don't need to be in the Manifest, or
at the very least they don't need to be distributed via rsync to users
(though it only costs a small number of bytes to do so).

Where else could they go? 
- Commit messages could work.
- Git notes to a lesser degree.
- Alternate repos?

> A reason why some people might prefer BLAKE2b over SHA2-512 is a
> performance improvement. However, seeing as right now we're opening
> the file, reading it, computing BLAKE2b, closing the file, opening the
> file again, reading it again, computing SHA2-512, closing the file, I
> don't think performance is actually something people care about. Seen
> differently, removing either one of them will already give us a
> performance "boost" or sorts.
Or verifying only the "strongest" hash gives you that boost.

I do want to check into the code that you pointed out, because I'm
really sure that much older versions of Portage did the CORRECT thing
and read the file in a single pass.
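
For reference, a single-pass version is short. This is a minimal sketch
using Python's hashlib, not the actual Portage code; the function name and
chunk size are arbitrary choices for illustration:

```python
import hashlib

def hash_file_single_pass(path, algorithms=("blake2b", "sha512"), chunk_size=1 << 20):
    """Return {algorithm: hexdigest}, opening and reading the file once."""
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as f:
        # Each chunk is read from disk once and fed to every hasher.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}
```

Adding or removing an algorithm then changes only the CPU cost, not the
number of times the file is read.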

-- 
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation Treasurer
E-Mail   : robb...@gentoo.org
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136
