I made some comments about this in a tag2upload contributors BoF at
DebConf which contained details new to Ian, and they asked me to write
those details down here.

While it's possible to use pristine-tar to commit deltas for tarballs
that are not represented by any git commit, the typical workflows that I
believe most Debian developers use take care to avoid that situation.
The ones I'm familiar with are git-buildpackage and git-dpm. In both
cases, the typical practice is to use "gbp import-orig" or "git-dpm
import-new-upstream" respectively to import a new upstream tarball into
git and merge it into the packaging branch. Among other things, these
tools construct a commit on the upstream branch (which may be called
"upstream" or some minor variation of that) with a tree whose contents
are identical to those of the tarball. That commit may or may not have
the corresponding upstream commit as an additional parent (via "gbp
import-orig --upstream-vcs-tag" or "git-dpm import-new-upstream
--parent"). The commit corresponding to the upstream tarball is then
merged onto the packaging branch in some way.
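To make the shape of that import commit concrete, here is a rough
sketch that builds one by hand in a scratch repository using only plain
git. It is an emulation for illustration, not how gbp or git-dpm are
implemented; the file names, the "v1.0" tag and the "unpacked" directory
standing in for an extracted tarball are all made up for the example.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git config user.name "Example Packager"
git config user.email "packager@example.invalid"

# Hypothetical upstream VCS history: one commit, tagged v1.0.
echo 'int main(void) { return 0; }' > main.c
git add main.c
git commit -q -m "Release 1.0"
git tag v1.0

# Stand-in for the unpacked upstream tarball: the same source plus a
# file that upstream ships only in the tarball.
mkdir "$work/unpacked"
cp main.c "$work/unpacked/"
echo '# generated at release time' > "$work/unpacked/configure"

# Record the unpacked contents as a git tree, using a scratch index so
# that the repository's own index and checkout are left alone.
tree=$(
    cd "$work/unpacked"
    export GIT_DIR="$work/repo/.git" GIT_WORK_TREE="$work/unpacked"
    export GIT_INDEX_FILE="$work/scratch-index"
    git add -A .
    git write-tree
)

# Commit that tree on an "upstream" branch, with the upstream VCS tag
# as a parent so that the import is tied into upstream's git history.
import=$(git commit-tree "$tree" -p v1.0 -m "Import upstream tarball 1.0")
git branch upstream "$import"

# The branch tip now holds exactly the tarball contents.
git ls-tree --name-only upstream
git diff --stat v1.0 upstream
```

The point is only that the tree of the "upstream" commit comes from the
tarball rather than from upstream's git, so whatever the tarball adds or
changes is visible as an ordinary git diff.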

My layperson understanding of pristine-tar is that "pristine-tar commit"
(which may be called directly, or via "gbp import-orig --pristine-tar"
or "git-dpm import-new-upstream --pristine-tar-commit") constructs a
binary delta expressing the differences between its canonicalized
compression of "git archive" and the target tarball, and commits that
delta to a branch called "pristine-tar". If its input parameters
include an upstream commit that doesn't correspond exactly to the target
tarball, then it's true that you might end up with tree contents that
don't really live anywhere other than the delta. However, modulo
bugs or weird edge cases (perhaps involving .gitattributes), this only
happens if somebody has called "pristine-tar commit" directly on a
mismatching commit; these higher-level tools won't do it.
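If you do find yourself calling "pristine-tar commit" by hand, one way
to guard against the mismatching case is to compare the commit with the
tarball first. The following is a minimal sketch using only git, tar
and diff. "tree_matches_tarball" is a hypothetical helper, not part of
any of these tools; it assumes the tarball unpacks into a single
top-level directory, compares file contents only (not modes), and takes
"git archive" at face value, so export-ignore rules in .gitattributes
could skew the result.

```shell
#!/bin/sh
set -eu

# tree_matches_tarball COMMIT TARBALL
# Succeeds if the files in COMMIT are identical to the contents of
# TARBALL, which is assumed to have one top-level directory.
tree_matches_tarball() {
    commit=$1
    tarball=$2
    tmp=$(mktemp -d)
    mkdir "$tmp/from-git" "$tmp/from-tar"
    git archive "$commit" | tar -x -f - -C "$tmp/from-git"
    tar -x -f "$tarball" -C "$tmp/from-tar" --strip-components=1
    if diff -r "$tmp/from-git" "$tmp/from-tar" >/dev/null; then
        status=0
    else
        status=1
    fi
    rm -rf "$tmp"
    return "$status"
}

# Scratch repository with a single commit to compare against.
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git config user.name "Example Packager"
git config user.email "packager@example.invalid"
echo 'hello' > README
git add README
git commit -q -m "Import 1.0"

# One tarball that matches the commit exactly ...
git archive --prefix=demo-1.0/ -o "$work/good.tar.gz" HEAD

# ... and one carrying an extra file that is not in git.
mkdir -p "$work/src/demo-1.0"
echo 'hello' > "$work/src/demo-1.0/README"
echo 'generated' > "$work/src/demo-1.0/configure"
tar -c -z -f "$work/extra.tar.gz" -C "$work/src" demo-1.0

for t in good extra; do
    if tree_matches_tarball HEAD "$work/$t.tar.gz"; then
        echo "$t.tar.gz: contents match HEAD"
    else
        echo "$t.tar.gz: contents differ from HEAD"
    fi
done
```

In the second case, committing a delta against HEAD would leave the
extra file recorded only in the delta, which is exactly the situation
the import tools avoid by committing the tarball contents first.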

https://salsa.debian.org/auth-team/libfido2 and
https://salsa.debian.org/debian/libpipeline are examples I'm familiar
with from each of those tools. In each case you'll find an "upstream"
branch that should be identical to the corresponding unpacked upstream
tarball. In the libfido2 case, there's no true upstream git history and
the "upstream" branch is just a sequence of tarball imports. In the
libpipeline case, the "upstream" branch has the corresponding upstream
commits as additional parents, so you can see everything clearly in git
history: e.g. "git diff upstream^2 upstream" shows you the differences
between upstream git and the tarball, while "git diff upstream~
upstream" shows you the differences between successive upstream
tarballs. While these packages use git-buildpackage and git-dpm
respectively, you can use both modes (with or without an additional
parent) with either tool.
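If you'd like to try those two diffs without cloning anything, the
sketch below fakes a small history of the second kind in a scratch
repository. "import_tarball" is a hypothetical stand-in for a real
import: it just commits upstream's tagged tree plus an invented
tarball-only "configure" file. The version numbers and file names are
likewise made up, and the first import is deliberately left without any
parents to keep the example short.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git config user.name "Example Packager"
git config user.email "packager@example.invalid"

# Hypothetical upstream VCS history with two tagged releases.
echo 'version 1.0' > NEWS
git add NEWS
git commit -q -m "Release 1.0"
git tag v1.0
echo 'version 1.1' > NEWS
git commit -q -a -m "Release 1.1"
git tag v1.1

# import_tarball VERSION [PARENT...]
# Prints the ID of a new commit whose tree is upstream's tagged tree
# plus a tarball-only "configure" file, with the given parents.
import_tarball() {
    version=$1
    shift
    parents=""
    for p in "$@"; do
        parents="$parents -p $p"
    done
    tree=$(
        export GIT_INDEX_FILE="$work/scratch-index"
        git read-tree "v$version"
        blob=$(echo "configure for $version" | git hash-object -w --stdin)
        git update-index --add --cacheinfo "100644,$blob,configure"
        git write-tree
    )
    # $parents is left unquoted on purpose so it splits into options.
    git commit-tree "$tree" $parents -m "Import upstream tarball $version"
}

# First import has no parents; the second has the previous import as
# its first parent and the upstream tag as its second.
first=$(import_tarball 1.0)
second=$(import_tarball 1.1 "$first" v1.1)
git branch upstream "$second"

echo "Tarball versus upstream git:"
git diff --name-status upstream^2 upstream

echo "Previous tarball versus this one:"
git diff --name-status upstream~ upstream
```

The first diff lists only the file that exists in the tarball but not in
upstream git; the second lists everything that changed between the two
tarballs.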

I hope this is helpful. Let me know if you need any extra help getting
your heads around pristine-tar and the associated workflows; I'm not
really an implementation expert, but I'm a proficient user and can
probably help to bridge any remaining gaps.

--
Colin Watson (he/him)