Hi, Simon.

You have correctly identified a number of challenges faced by the git
transition project.  Yes, we're aware of them.  This is a *transition*,
which means not everything can immediately be done the new way.  We
don't think any of these are blockers to the progress we want to make
right now.

Simon Richter writes ("Re: Include git commit id and git tree id in *.changes 
files when uploading?"):
> On 1/6/26 1:59 AM, Ian Jackson wrote:
> > Sadly, "well-maintained" by current (legacy) Debian standards does not
> > mean that the .dsc in Debian even corresponds to upstream git, let
> > alone that it is easy to automatically verify that correspondence.
> 
> There are also a lot of well-maintained packages where the .orig archive 
> does not correspond to upstream git because of omitted files.

Yes.  I think in these situations one should usually use upstream git,
not the tarball.  Typically in these cases the additional stuff in the
tarballs is generated in a more or less ad-hoc way by an upstream
release technician.  We want to rebuild those things anyway, so having
them in our view of the source code is undesirable.

But if the packaging is set up to expect the semi-autogenerated stuff,
changing over to just use actual source code is additional work.
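
As a rough illustration of the kind of mismatch meant here, this is a
minimal sketch (not part of dgit, gbp or any other Debian tool) that
compares the member names of a release tarball with the paths tracked
in upstream git.  The function names and the file lists are
hypothetical, made up purely for the example.

```python
from pathlib import PurePosixPath


def strip_top_dir(member_name):
    """Drop the leading 'pkg-1.0/' component that release tarballs usually have."""
    parts = PurePosixPath(member_name).parts
    return "/".join(parts[1:])


def compare_sources(tarball_members, git_paths):
    """Return (only_in_tarball, only_in_git) as sorted lists of paths."""
    tar_paths = {strip_top_dir(m) for m in tarball_members}
    tar_paths.discard("")  # the top-level directory entry itself
    git_set = set(git_paths)
    return sorted(tar_paths - git_set), sorted(git_set - tar_paths)


# Hypothetical example data, not taken from any real package.
tarball_members = [
    "foo-1.0/",
    "foo-1.0/configure",      # generated by the release process
    "foo-1.0/configure.ac",
    "foo-1.0/src/main.c",
]
git_paths = ["configure.ac", "src/main.c", ".gitignore"]

only_in_tarball, only_in_git = compare_sources(tarball_members, git_paths)
print("only in tarball:", only_in_tarball)
print("only in git:", only_in_git)
```

Files that turn up only in the tarball are the likely candidates for
"semi-autogenerated stuff" that the packaging may have come to rely on.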

> We also won't be able to stop importing tarballs, because there are 
> projects that *do* release tarballs as artifacts, and sign those.

Yes, if upstream don't sign their git tags but only sign tarballs,
then that might be the right tradeoff.  That's why tools like
gbp import-orig do still have a place.  They just shouldn't be the
usual way.

> Another thing that is difficult to handle are upstreams that use 
> submodules -- for example, the upstream for one of my packages uses 
> submodules that themselves refer to other submodules, and some of the 
> links are relative, some are absolute.

Yes.  Submodules are terrible[1], but some upstreams are using them.
We don't have very good tooling for this yet.

If the submodule is something that we in Debian don't want to use
(an embedded code copy, say, or support for some strange OS), then it
can (and should) just be deleted.  But there are cases where we do
need the submodule's contents.

> I currently repack these into a single tarball, with uscan. Not great, 
> not terrible. I want to switch these over to multiple tarballs, each of 
> them generated using git-archive, so we have multiple commit-id 
> annotations, which should make it easier to track where they are coming 
> from.

I think git-debrebase and dgit might be able to support such a
package, with a bit of manual work.  (The resulting Debian git tree
would be merged, not submodules.)  tag2upload currently can't,
because it doesn't support multiple origs.  I think this thread isn't
the right place for me to sketch out a procedure.

> We will also need to prepare for upstreams to start using sha256 
> objects. Github's unwillingness to support sha256 is buying us some time 
> for now, but git upstream is working on tools to convert an entire 
> repository and provide lookup of sha1 object IDs -- this will be another 
> tooling challenge for us to handle.

Indeed so.

I have been quite frustrated by git upstream's approach to the SHA1
problem.  They designed a new thing, but it was basically undeployable
because there wasn't a transition plan.  It's actually quite helpful
to us for once, to have a corporate behemoth standing in the way.

A decade ago I posted an alternative proposal to the git mailing
list, but I was told my proposal (which would have allowed a gradual
switchover) was far too complicated and would never come to fruition,
and that they wanted to do their "simple" thing.  Spoiler for this
story: their "simple" thing still isn't deployed.

Discouraged, I stopped following upstream activity on this topic.  It
sounds like there may be movement.  It's bound to be painful but we
will cope.

> This makes the choice of tarballs less bizarre, I think: they are not a 
> moving target. To use git as an interface for the archive, we also need 
> to make git less of a moving target, and do so in a way that is not 
> restrictive for upstream authors.

git has been around for 20 years, well over half of Debian's own
lifespan.  During the last 3 decades, we in Debian have made numerous
transitions.  It's a thing we're actually good at (when we can be
bothered).  Perhaps *the* thing we're actually good at.

> I'm also missing a plan how to distribute the source code after the 
> transition is complete. For static files, we have a mirror network, 
> sponsored CDNs and scripts to create CD-ROM images, and I can easily 
> create a local mirror using debmirror if I want.

The number of people who need the source code is much smaller than the
number who need binaries.  So the problem is much less severe.  Also,
git is growing methods for making this kind of thing easier; eg, I was
just reading about the bundle URI feature, which seems to have been
developed to help forge operators (who face similar problems).

But you are right that this is not a completely-solved problem.  We
don't currently know how difficult it will be, but the fundamental
nature of the task is a subset of what's needed for running a forge.
We're currently running one forge already.  Its reliability isn't what
we would like it to be; I don't think that's any reflection on the
team trying to run it - rather, my guess is that it's because it's a
gigantic and buggy pile of corporate shonkware.

And of course we are currently running dgit-repos.  DSA have been very
helpful in keeping it working most of the time despite the ongoing
attacks by AI shysters.

> This is not a problem as long as we are generating .dsc files still.
> 
> Without a plan, we will be doing this indefinitely, or until someone 
> decides that it is too painful to continue, and we (and our users) need 
> to accept a few regressions in order to move forward for the sake of 
> modernity, and there will be a massive flamewar about it.

Certainly I'm not proposing to stop producing .dscs until we can see
that the replacement systems are working well.


I'm not saying all of this is *easy*.  It generally isn't.  dgit has
many thousands of lines of compatibility code.  Sean and I have been
working quite hard on tag2upload.  DSA and others have been giving us
their support.

But we think the problems are soluble, and the benefits are
worthwhile.  Especially, we think the benefits in terms of user and
developer experience are worthwhile.

It's all very well building a Free operating system, but we need our
users to have the actual practical ability to exercise their software
freedom.  That means not just caring about licences.  It means caring
that our users can get shit done without having to tear their hair
out.


Ian.

[1]
   Never use git submodules
   https://diziet.dreamwidth.org/14666.html

-- 
Ian Jackson <[email protected]>   These opinions are my own.  

Pronouns: they/he.  If I emailed you from @fyvzl.net or @evade.org.uk,
that is a private address which bypasses my fierce spamfilter.
