The idea of package deltas just won't go away... However, binary diffs
really are not ideal with pacman verifying the compressed package - that
means we need to reconstruct the package on the user's system to verify it.
Also our old approach using xdelta3 somewhat died when moving packages
away from gz (or xz?) compression. Other binary diff approaches really
suffered the same issue. In general, I find the approach of
reconstructing the full package to be suboptimal. I also don't
particularly want to verify uncompressed packages.

I wondered if this was a case of perfect being the enemy of good, so I
have investigated a different, very lazy approach. Instead of taking a
binary diff, we could just provide the files that have changed between
package versions. This is super easy to do as we have checksums for all
files in the mtree file. We could then extract this "diff" package
directly, and use the mtree file to adjust timestamps/permissions/etc(?)
on kept files, and it would be just like the full package had been
installed.
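To make the comparison concrete, here is a minimal Python sketch of how
the changed files could be picked out of two manifests. It assumes a
simplified, already-decompressed mtree-style listing with one entry per
line and a sha256digest keyword. parse_manifest and plan_diff are
hypothetical names, not pacman or libalpm functions, and a real
implementation would have to handle /set defaults, escaped paths and
non-file entries properly.

```python
def parse_manifest(text):
    """Map path -> keyword dict for a simplified mtree-style listing."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments and "/set" lines (assumption: a real
        # parser would apply the /set defaults instead of ignoring them).
        if not line or line.startswith(("#", "/")):
            continue
        path, *fields = line.split()
        entries[path] = dict(f.split("=", 1) for f in fields if "=" in f)
    return entries


def plan_diff(old, new):
    """Split entries into files to ship, files to keep, and files to remove."""
    ship, keep = [], []
    for path, attrs in new.items():
        digest = attrs.get("sha256digest")
        old_digest = old.get(path, {}).get("sha256digest")
        if digest is not None and digest == old_digest:
            keep.append(path)   # same content: only metadata may need updating
        else:
            ship.append(path)   # new, changed, or no digest: goes in the diff
    remove = sorted(set(old) - set(new))
    return sorted(ship), sorted(keep), remove


# Made-up sample data, not taken from any real package.
OLD = """\
./usr/bin/tool time=1.0 mode=755 sha256digest=aaa
./usr/share/doc/tool/README time=1.0 sha256digest=bbb
./usr/share/tool/old.dat time=1.0 sha256digest=ccc
"""
NEW = """\
./usr/bin/tool time=2.0 mode=755 sha256digest=ddd
./usr/share/doc/tool/README time=2.0 sha256digest=bbb
./usr/share/tool/new.dat time=2.0 sha256digest=eee
"""

ship, keep, remove = plan_diff(parse_manifest(OLD), parse_manifest(NEW))
```

For the sample data, ship holds the changed binary and the new data
file, keep holds the README (whose timestamp would then be adjusted from
the new mtree), and remove holds old.dat.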

I ran some numbers to see if this was worthwhile. The results for the
last bunch of updates for bash, coreutils, qt5-base and systemd are
given here:
https://wiki.archlinux.org/title/User:Allan/Pkgdiff

On major version updates, this approach is a waste of time. But for
minor updates the bash download would average 25% of the full package
size, coreutils about 36% (though it was ~1% for simple rebuilds!),
qt5-base about 40% and systemd 60%. Not shown, but worth noting: when
Arch changes gcc/binutils versions or updates CFLAGS etc, this can stop
any binary diff being as useful.

If we implemented these diffs but only allowed them for updates from
the previous package version (i.e. no diffs to package (current - 2) or
earlier, or diff chaining), then this would be rather simple to
implement (at least from the pacman side...).
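The "previous version only" rule boils down to a single equality check.
A minimal Python sketch, assuming the repo advertises which version the
diff was built against (diff_base is a hypothetical field and
choose_download a hypothetical name, neither exists in pacman today):

```python
def choose_download(installed, diff_base):
    """Return "diff" only if the diff was built against the installed version."""
    # No chaining and no diffs against older versions: anything that is
    # not an exact match falls back to the full package.
    if diff_base is not None and installed == diff_base:
        return "diff"
    return "full"
```

Anything other than an exact match, including a package with no diff
available at all, simply downloads the full package as it does now.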

Any comments before I invest time implementing this?

Allan