Source: gnupg2
Severity: minor
X-Debbugs-Cc: Daniel Kahn Gillmor <d...@fifthhorseman.net>

The gnupg2 package is built from source based on the upstream released tarball. Upstream also uses git for revision control, and we track upstream git as well as the released tarballs. Upstream uses OpenPGP to sign both git tags and released tarballs.

We trim many prebuilt files from the tarball, so what's in our Debian packaging repositories is pretty close to upstream's git repos, but not identical.

Inspired by the recent xz mess, where malicious files were slipped into a tarball, i'd like to minimize the amount of non-tracked source used in GnuPG. I think we should use debian/clean (and gbp import-orig's filtering, see #1071200) to trim out all of the generated files before build, so that what we're building from source is as close to upstream traceable git commits as possible.

I did a quick scan of what we're shipping in revision control (hence, what's in the filtered tarball) that the upstream git tag isn't accounting for. Here's what i found:

    $ git diff --stat gnupg-2.2.43..upstream/2.2.43 | grep -e '\+' -e 'Bin 0 ->'
     ChangeLog                       | 34710 ++++++++++++++++++-
     VERSION                         |     1 +
     common/audit-events.h           |   116 +
     common/status-codes.h           |   248 +
     doc/defsincdate                 |     1 +
     doc/gnupg-card-architecture.pdf | Bin 0 -> 19221 bytes
     doc/gnupg-card-architecture.png | Bin 0 -> 8843 bytes
     doc/gnupg-module-overview.pdf   |   408 +
     doc/gnupg-module-overview.png   | Bin 0 -> 124560 bytes
     po/ca.po                        |  2295 +-
     po/cs.po                        |  2303 +-
     po/da.po                        |  2299 +-
     po/de.po                        |  2310 +-
     po/el.po                        |  2295 +-
     po/e...@boldquot.po             | 10967 ++++++
     po/e...@quot.po                 | 10951 ++++++
     po/eo.po                        |  2295 +-
     po/es.po                        |  2307 +-
     po/et.po                        |  2299 +-
     po/fi.po                        |  2295 +-
     po/fr.po                        |  2299 +-
     po/gl.po                        |  2303 +-
     po/gnupg2.pot                   | 10636 ++++++
     po/hu.po                        |  2295 +-
     po/id.po                        |  2295 +-
     po/it.po                        |  2295 +-
     po/ja.po                        |  2295 +-
     po/nb.po                        |  2295 +-
     po/pl.po                        |  2295 +-
     po/pt.po                        |  2295 +-
     po/ro.po                        |  2307 +-
     po/ru.po                        |  2303 +-
     po/sk.po                        |  2303 +-
     po/sv.po                        |  2299 +-
     po/tr.po                        |  2295 +-
     po/uk.po                        |  2299 +-
     po/zh_CN.po                     |  2295 +-
     po/zh_TW.po                     |  2291 +-
     regexp/_unicode_mapping.c       |   284 +
     242 files changed, 127919 insertions(+), 42329 deletions(-)

The doc/*.{pdf,png} stuff is fixed already, as of 2.2.43-3, and will be filtered out whenever we move to the next upstream release.

Here's my attempt at analyzing what remains:

ChangeLog: this is generated automatically by upstream from upstream git history, and we ship it (half a meg after compression!) in all the produced packages. This seems like a lot, and we ought to be able to drop it from nearly everywhere. What if we just shipped it with gnupg2-doc, and left the other packages with a simple text file? Or what if we just stopped shipping it altogether? Will anyone mind? The details are at developer-level, and it'll still be in the source tarballs if anyone wants to read the file.

VERSION: this contains only the upstream version number. Can we generate it ourselves from debian/changelog?

doc/defsincdate: this file is generated upstream, and can potentially introduce non-reproducibility (see debian/patches/debian-packaging/avoid-regenerating-defsincdate-use-shipped-file.patch for more discussion). If we strip that file, and drop the above patch (or tune it so that it only works with $SOURCE_DATE_EPOCH), then we should be able to avoid unreproducibility. Doing so would mean that generated documentation files would have the timestamp of the changelog entry, though, rather than the timestamp of the upstream tarball. That might make (for example) a diffoscope comparison of shipped files between point releases unnecessarily noisy.

common/{audit-events,status-codes}.h: these appear to be stripped and rebuilt in maintainer-mode. At least one of our builds currently runs in maintainer-mode, so it seems like we ought to be able to strip them and ensure that they get rebuilt, but i haven't tested.

regexp/_unicode_mapping.c: this is another maintainer-mode file, generated from UnicodeData.txt. It looks like it contains a mapping between upper and lower case codepoints.

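One way to get a rough sense of which case pairs a given UnicodeData.txt knows about is to list them and compare two copies. This is only a minimal sketch, not upstream's generator: the helper names case_pairs and missing_pairs are made up for illustration, and it assumes the standard semicolon-separated UnicodeData.txt layout, where field 14 is the simple lowercase mapping.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of GnuPG or its build system.

# Print "UPPER LOWER" codepoint pairs, one per line, sorted, for every entry
# in a UnicodeData.txt-style file that has a simple lowercase mapping
# (assumption: field 14 of the semicolon-separated record).
case_pairs() {
    awk -F';' '$14 != "" { print $1, $14 }' "$1" | LC_ALL=C sort -u
}

# Print the pairs that appear in the second file but not in the first.
missing_pairs() {
    old=$(mktemp) && new=$(mktemp) || return 1
    case_pairs "$1" > "$old"
    case_pairs "$2" > "$new"
    LC_ALL=C comm -13 "$old" "$new"
    rm -f "$old" "$new"
}
```

Called with the copy of UnicodeData.txt bundled in the source tree as the first argument and /usr/share/unicode/UnicodeData.txt as the second, missing_pairs would list pairs that only the newer file records. Where exactly the bundled copy lives is not something i have checked here.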
Debian ships a more up-to-date UnicodeData.txt in the unicode-data package, which includes some codepoints (like GLAGOLITIC CAPITAL LETTER CAUDATE CHRIVI and GLAGOLITIC SMALL LETTER CAUDATE CHRIVI) that are paired casewise, but are not represented in regexp/_unicode_mapping.c. Maybe the right (and more up-to-date) solution is to build-depend on unicode-data, strip both the generated file and the bundled UnicodeData.txt in debian/clean, and patch the build to generate the mapping from /usr/share/unicode/UnicodeData.txt instead.

I'm not sure what to do about the po/??.po files. They all appear to be modified by upstream during "make dist" (when the tarball is created), which adds source code file and line number annotations, and then our build process re-annotates them. It seems like it would be nicer to work with the unannotated files, because then we could apply patches that are simpler to port from version to version.

I also don't fully understand the l10n mechanism used here: if po/e...@boldquot.po, po/e...@quot.po, and po/gnupg2.pot are generated during "make dist", it seems like we ought to be able to generate them ourselves directly, but i haven't tested.

Happy to hear any suggestions about the right way forward to bring GnuPG in Debian more in line with upstream's revision control, to reduce the amount of slippage that can be introduced in a tarball. If we could somehow prune to a state where we are building from (a subset of) the intersection of the upstream git tag and the released tarball, that would give us something concrete to automatically check on each version upgrade.

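Such an automated check could start out as a plain file-list comparison between the git tag and the tarball. A minimal sketch, assuming the tarball unpacks into a single top-level directory; strip_top_dir and tarball_only are hypothetical helper names, not existing tools:

```shell
#!/bin/sh
# Hypothetical helpers for illustration only.

# Drop the leading "name-version/" component from each path on stdin and
# skip directory entries (empty results and paths ending in "/").
strip_top_dir() {
    sed -e 's|^[^/]*/||' -e '/^$/d' -e '/\/$/d'
}

# Given two files with one path per line (first: files tracked in git,
# second: files found in the tarball), print paths only in the tarball.
tarball_only() {
    git_list=$(mktemp) && tar_list=$(mktemp) || return 1
    LC_ALL=C sort -u "$1" > "$git_list"
    LC_ALL=C sort -u "$2" > "$tar_list"
    LC_ALL=C comm -13 "$git_list" "$tar_list"
    rm -f "$git_list" "$tar_list"
}
```

The first list could come from "git ls-tree -r --name-only" run against the upstream tag, and the second from "tar -tf" on the released tarball piped through strip_top_dir. Anything tarball_only prints is either a candidate for debian/clean and import filtering, or something to explain. This only compares file names; files that exist in both places but differ in content (like the po/??.po files) would need a separate content diff.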
        --dkg

-- System Information:
Debian Release: trixie/sid
  APT prefers testing-debug
  APT policy: (500, 'testing-debug'), (500, 'testing'), (500, 'stable'), (500, 'oldstable'), (200, 'unstable-debug'), (200, 'unstable'), (1, 'experimental-debug'), (1, 'experimental')
Architecture: amd64 (x86_64)

Kernel: Linux 6.7.12-amd64 (SMP w/4 CPU threads; PREEMPT)
Kernel taint flags: TAINT_FIRMWARE_WORKAROUND
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)

-- no debconf information