On 31.03.24 01:07, Elliott Mitchell wrote:
On Sat, Mar 30, 2024 at 03:30:49PM +0000, Daniel Golle wrote:

unchanged. Git has a lot of security built-in, and by using tarballs
as a base for our package builds we are basically throwing all that
away, for the sake of saving a negligible amount of resources on
the build infrastructure.

I sort of agree, sort of disagree with this.  Having a cryptographic hash
at the center of everything provides security comparable to the security
of the hash.  Alas, this means replacing that hash is a bit difficult.

The design is good, but SHA-1 is no longer sufficiently secure.
Replacing SHA-1 is a work in progress, but until that completes SHA-1 is
still the core of *everything*.  I've been monitoring the situation and
early work started in 2017, but it still isn't usable yet.  Until it is
ready there is this rather oversized elephant in the room.

https://git-scm.com/docs/hash-function-transition
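
For what it's worth, current Git releases can already create repositories
using the SHA-256 object format, although interoperability with existing
SHA-1 remotes is still incomplete, so for now it mostly serves as a way to
experiment with the transition:

    # Create a repository using SHA-256 object names (supported since Git 2.29):
    git init --object-format=sha256 sha256-test
    # Show which hash function a repository is using:
    git -C sha256-test rev-parse --show-object-format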

(SHA-1 collisions aren't known to have been used for anything /yet/,
but it is only a matter of time; this *really* worries me)

Assuming that SHA-1 collisions become much easier to create, what attack scenarios are you worried about?

On Sat, Mar 30, 2024 at 03:30:49PM +0000, Daniel Golle wrote:

However, after reading up about the details of this backdoored release
tarball, I believe that the current tendency to use tarballs rather
than (reproducible!) git checkouts is also problematic to begin with.

Stuff like 'make dist' seems like a weird relic nowadays and creates more
problems than it could potentially solve; bandwidth is ubiquitous, and
we already have our own tarball mirror of git checkouts generated by the
buildbots (see PKG_MIRROR_HASH). So why not **always** use that
instead of potentially shady and hard-to-verify tarballs?

I think Daniel's proposal is a very good idea. It reduces points of failure while adding very little cost in terms of dev/maintainer resources.
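
For a typical package this would amount to little more than swapping the
release-tarball variables for their git equivalents; a rough sketch (all
values below are made up, not taken from a real package):

    # Hypothetical package recipe fetching from Git instead of a release tarball:
    PKG_NAME:=example
    PKG_VERSION:=1.2.3
    PKG_RELEASE:=1

    PKG_SOURCE_PROTO:=git
    PKG_SOURCE_URL:=https://example.org/libs/example.git
    PKG_SOURCE_DATE:=2024-03-30
    PKG_SOURCE_VERSION:=<commit hash of the v1.2.3 tag>
    # SHA-256 of the tarball the buildbots generate from that checkout:
    PKG_MIRROR_HASH:=<sha256 of the generated mirror tarball>

The build system can then fetch the mirrored tarball and verify it against
PKG_MIRROR_HASH, so end users never need to talk to the upstream git server
at all.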

I don't think the issue is so much that tarballs are archaic, but that
*everyone* is using Git now.  One proposed patch from a pull request:

https://github.com/openwrt/openwrt/pull/14280/commits/1b29aadbbf07cb77498a0eb92fe7c171c65dab2e

I don't see a single reference to a version control system besides Git
anywhere in OpenWRT at this point.  Tarballs were a reasonable choice
when there were >4 source code management systems in use, yet now Git is
the common denominator.  So if everything is in Git, how does handling
tarballs help builds?

Several ways:
- significantly faster download compared to git clone
- relying less on SHA1, since we use SHA256 for the tarballs
- proper mirror support for more reliable builds
- makes it easier to create a tarball of a specific OpenWrt version which does not need to download any extra files

Also, if SHA-1-collision-based repository manipulation ever becomes practical, a failure to deterministically reproduce our tarballs would make it more visible.
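
As a rough illustration of such a check (this is not exactly how the
buildbots generate their tarballs, just the general idea of rebuilding a
deterministic archive from the tag and comparing hashes):

    # Rebuild an archive from the upstream tag and compare its hash against
    # the one recorded for the package (illustrative only):
    git clone https://example.org/libs/example.git
    git -C example archive --format=tar --prefix=example-1.2.3/ v1.2.3 \
        | gzip -n > example-1.2.3.tar.gz
    sha256sum example-1.2.3.tar.gz    # compare with the recorded hash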

Always using git checkouts instead of tarballs would also make it
much easier for maintainers to at least have a quick look at the
changes made in an upstream project between versions (a quick scroll
over 'git diff oldtag..newtag' or even just 'git log --stat
oldtag..newtag' doesn't take much more time than manually validating a
release tarball GPG signature in most cases, if there even is any...).
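
For a typical version bump the whole check can be as short as (tag names
are only examples):

    # Quick look at what changed upstream between two releases:
    git fetch origin --tags
    git log --stat v1.2.2..v1.2.3
    git diff v1.2.2..v1.2.3
    # If upstream signs its tags, git can verify that directly as well:
    git verify-tag v1.2.3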

I see several issues with your argument, but I mostly agree with your
conclusion.  Git is *everywhere*, so why use tarballs?

I disagree with your approach though.  Git already has two tools for
handling this situation and I think one of them should be chosen.

The first is `git submodule`.  My understanding is it is pretty similar to
OpenWRT's current approach.  The difference is this lets `git` handle
downloading other repositories instead of doing it in a Makefile.  Since
Git is already designed to handle this sort of task, I suspect this will
be rather more reliable than the existing system.
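
A minimal sketch of what that could look like (repository name and path are
made up):

    # Record an external repository as a submodule pinned to a specific commit:
    git submodule add https://example.org/libs/example.git package/libs/example
    git -C package/libs/example checkout v1.2.3
    git add .gitmodules package/libs/example
    git commit -m "package/libs/example: add as submodule at v1.2.3"

    # What a downstream user runs after cloning:
    git clone --recurse-submodules https://git.openwrt.org/openwrt/openwrt.git
    # or, in an existing checkout:
    git submodule update --init --recursive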

The second is `git subtree`.  This is a tool for including other projects
in a repository.  The end result is that the other project's history becomes
merged into the local history.  One advantage is you download everything all
at *once*, rather than grabbing tools individually.  The other is that the full
history will make upgrades easier, since differences will be more obvious.
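
Again just a sketch (prefix and versions hypothetical):

    # Import another project into the tree, keeping its full history:
    git subtree add --prefix=package/libs/example \
        https://example.org/libs/example.git v1.2.3

    # Later, pull a new upstream release into the same prefix:
    git subtree pull --prefix=package/libs/example \
        https://example.org/libs/example.git v1.2.4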

These would require major changes to the build system.
Here are several downsides to this approach:

- significantly slower downloads
- no download mirror support
- if somebody uses submodules in a downstream fork, they can't easily rely on git submodule update anymore without having to pull in tons of potentially unused stuff

What are the benefits we would be getting from all this rework churn?

- Felix

