I made some comments about this in a tag2upload contributors BoF at DebConf which contained details new to Ian, and they asked me to write those details down here.

While it's possible to use pristine-tar to commit deltas for tarballs that are not represented by any git commit, the typical workflows that I believe most Debian developers use take care to avoid that situation. The ones I'm familiar with are git-buildpackage and git-dpm. In both cases, the typical practice is to use "gbp import-orig" or "git-dpm import-new-upstream" respectively to import a new upstream tarball into git and merge it into the packaging branch. Among other things, these tools construct a commit on the upstream branch (which may be called "upstream" or some minor variation of that) with a tree whose contents are identical to those of the tarball. That commit may or may not have the corresponding upstream commit as an additional parent (via "gbp import-orig --upstream-vcs-tag" or "git-dpm import-new-upstream --parent"). The commit corresponding to the upstream tarball is then merged into the packaging branch in some way.
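
To make the "tree whose contents are identical to those of the tarball" property concrete, here is a minimal Python sketch of how one might check it by hand against a checkout of the upstream branch. This is not how gbp or git-dpm work internally. The function names are hypothetical, and the sketch assumes that the tarball unpacks into a single top-level directory and that comparing regular file contents is enough (it ignores symlinks, permissions and empty directories).

```python
import hashlib
import os
import sys
import tarfile


def tarball_file_hashes(tar_path, strip_components=1):
    """Map each regular file in the tarball to the SHA-256 of its contents.

    strip_components drops leading path elements, on the assumption that
    the tarball has a single top-level directory such as "foo-1.0/".
    """
    hashes = {}
    with tarfile.open(tar_path) as tar:
        for member in tar:
            if not member.isfile():
                continue  # skip directories, symlinks, devices, etc.
            parts = member.name.split("/")[strip_components:]
            if not parts:
                continue
            data = tar.extractfile(member).read()
            hashes["/".join(parts)] = hashlib.sha256(data).hexdigest()
    return hashes


def tree_file_hashes(root):
    """Map each regular file under root to the SHA-256 of its contents."""
    hashes = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Don't descend into git metadata in a checked-out working tree.
        if ".git" in dirnames:
            dirnames.remove(".git")
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full) or not os.path.isfile(full):
                continue  # only regular files, to match the tarball side
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            with open(full, "rb") as fh:
                hashes[rel] = hashlib.sha256(fh.read()).hexdigest()
    return hashes


def tree_matches_tarball(tar_path, root):
    """True if both sides contain the same files with the same contents."""
    return tarball_file_hashes(tar_path) == tree_file_hashes(root)


if __name__ == "__main__":
    # Usage: script.py <orig tarball> <checkout of the upstream branch>
    if len(sys.argv) == 3:
        print(tree_matches_tarball(sys.argv[1], sys.argv[2]))
```

Run with the path to an orig tarball and a working tree checked out at the upstream branch, this prints True when the two agree on every regular file. A tree with files that git would not have tracked (or that .gitattributes would alter on export) could legitimately differ, so treat a mismatch as a prompt to look closer rather than as proof of a problem.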

My layperson understanding of pristine-tar is that "pristine-tar commit" (which may be called directly, or via "gbp import-orig --pristine-tar" or "git-dpm import-new-upstream --pristine-tar-commit") constructs a binary delta expressing the differences between its canonicalized compression of "git archive" and the target tarball, and commits that delta to a branch called "pristine-tar". If its input parameters include an upstream commit that doesn't correspond exactly to the target tarball, then it's true that you might end up with tree contents that don't really live anywhere other than in the delta. However, modulo bugs or weird edge cases (perhaps involving .gitattributes), this only happens if somebody has called "pristine-tar commit" directly on a mismatching commit; these higher-level tools won't do it.

https://salsa.debian.org/auth-team/libfido2 and https://salsa.debian.org/debian/libpipeline are examples I'm familiar with from each of those tools. In each case you'll find an "upstream" branch that should be identical to the corresponding unpacked upstream tarball. In the libfido2 case, there's no true upstream git history and the "upstream" branch is just a sequence of tarball imports. In the libpipeline case, the "upstream" branch has the corresponding upstream commits as additional parents, so you can see everything clearly in git history: e.g. "git diff upstream^2 upstream" shows you the differences between upstream git and the tarball, while "git diff upstream~ upstream" shows you the differences between successive upstream tarballs. While these packages use git-buildpackage and git-dpm respectively, you can use both modes (with or without an additional parent) with either tool.

I hope this is helpful. Let me know if you need any extra help getting your heads around pristine-tar and the associated workflows; I'm not really an implementation expert, but I'm a proficient user and can probably help to bridge any remaining gaps.

--
Colin Watson (he/him)                              [[email protected]]
