Re: [RFC] General Resolution to deploy tag2upload

2024-06-13 Thread Andreas Tille
Hi Russ,

On Thu, Jun 13, 2024 at 01:54:16PM -0700, Russ Allbery wrote:
> 
> It would be nice to not do this on the tag2upload server, though, to
> maintain some security separation.

ACK.
 
> Given all of that, I think it would be more promising to look into a
> deeper integration with Salsa to check if the Salsa CI has succeeded, as
> discussed earlier in this thread.  That would also match common upstream
> practice in Git-first development where the workflow for generating the
> release artifact depends on all of the tests passing through the normal CI
> mechanism.

As I said I did not read this thread in total (neither will I manage to
do so soon) but I would really welcome Salsa CI integration into
tag2upload.  It sounds perfectly sensible to me in many ways.

Kind regards
   Andreas.

-- 
https://fam-tille.de



Re: [RFC] General Resolution to deploy tag2upload [and 1 more messages]

2024-06-13 Thread Russ Allbery
> On 6/14/24 00:50, Russ Allbery wrote:

>> This is why people are working on incremental improvements.  I think
>> such improvements are more likely to get us closer to where we want to
>> be than a boil-the-ocean approach that attempts wholesale change to how
>> Debian works.  It's easy to come up with new designs that in theory
>> would be more coherent and straightforward, and very hard in practice
>> to avoid that turning into .

> The reason we have multiple git workflows is because they are
> incremental designs that do not try to change the way Debian works, or
> the way git works.

That may be a reason, but I think the primary reason why we have multiple
Git workflows is because we have a lot of contributors, and many of us
have strong opinions about how things should work and don't agree with
each other.  For example, in this thread you have named as problems some
aspects of our current Git packaging workflows that I quite like and would
be annoyed to lose.

Anyway, most of your comments seem to be orthogonal to this proposal and
are about other things that you want Debian to explore.  Debian is a
volunteer, self-driven project, and I hope no one is stopping you from
exploring those ideas whenever you have a chance.  I'm certainly open to
evaluating a design for a more radical change, particularly if there's a
clear transition plan.  In the meantime, we should not stop incrementally
improving the infrastructure we have today.

> At the very least, we need to make it explicit which repository layout
> is to be used, and version and document that interface, then support it
> for several years in the future even as we make incremental changes,
> because we want to be able to regenerate packages from the git archive.

I believe that dgit-repos already stores a standardized Git representation
of a source package.  This partly addresses your point here, I think.

-- 
Russ Allbery (r...@debian.org)  



Re: [RFC] General Resolution to deploy tag2upload [and 1 more messages]

2024-06-13 Thread Simon Richter

Hi,

On 6/14/24 00:50, Russ Allbery wrote:


>> We have several 90% solutions of mapping Debian packaging onto git, but
>> all of these are incomplete and annoying to use because we disagree with
>> git on what constitutes data, and what constitutes metadata, so the data
>> model does not match reality or requirements, and from a security
>> standpoint that concerns me more than improved forensics.

> This is why people are working on incremental improvements.  I think such
> improvements are more likely to get us closer to where we want to be than
> a boil-the-ocean approach that attempts wholesale change to how Debian
> works.  It's easy to come up with new designs that in theory would be more
> coherent and straightforward, and very hard in practice to avoid that
> turning into .


The reason we have multiple git workflows is because they are 
incremental designs that do not try to change the way Debian works, or 
the way git works.


With the current Debian archive we have a well-defined (documented in 
Policy) interface for uploads, and the git workflows are implementation 
details that the archive need not be concerned with. This has allowed us 
to use git in the first place.


By creating an upload service, we elevate git to "interface" status. 
That would be a good thing if there was a single interface. However, we 
have three (that I know of), none of which was designed to talk to 
anything but itself, and the service uses a heuristic to determine which 
one is used.


At the very least, we need to make it explicit which repository layout 
is to be used, and version and document that interface, then support it 
for several years in the future even as we make incremental changes, 
because we want to be able to regenerate packages from the git archive.


Tag2upload is an increment over an increment over something that was not 
designed as an interface, and while each increment is technically sound, 
the overall design needs to be revisited because it needs to support all 
these incremental changes.


I think that with existing git it is difficult to represent the history 
of packages well, because we need to record a history of what are 
effectively rebases, and representing them as a merge paints a wrong 
picture for git, because it assumes that everything upstream of a merge 
is already accounted for.


One _incremental_ change I'd like to see would be archive support for 
.orig.bundle.* (containing a shallow copy of the upstream commit) and 
.debian.bundle.* (containing the differences between the upstream commit 
and the package). That would be an absolute game changer for git 
integration. The archive side would probably be fairly simple to 
implement, and it would allow us to ship the "preferred form for 
modification" for a lot of projects more easily.


Mirrors would still get a size-minimal representation. This format does 
not impose a particular workflow and can be easily generated from and 
validated against the full tree.


   Simon



Re: [RFC] General Resolution to deploy tag2upload [and 1 more messages]

2024-06-13 Thread Sean Whitton
Hello,

On Thu 13 Jun 2024 at 04:13pm +01, Simon McVittie wrote:

> On Thu, 13 Jun 2024 at 15:08:15 +0100, Ian Jackson wrote:
>> I think it is possible that there will be a handful of packages where
>> things are significantly more awkward, which might not be able to
>> adopt tag2upload.
>
> This would presumably be the same minority of packages where maintainers
> use a debian/-only workflow (even if they normally prefer to keep upstream
> source in git) and avoid dgit (even if they normally prefer to use it),
> because the upstream source is too bulky to be convenient to track in git?
> Such as the openarena-data family and other large game assets?
>
> Those packages are already exceptional and already need to be handled
> specially. They'd only be a problem if dgit and/or tag2upload became
> mandatory, which (as far as I understand it) is not the plan.

Yup, those packages -- I myself always think of you and some of your
packages when I think about this particular issue.

-- 
Sean Whitton



Re: Security review of tag2upload [transfer.fsckObjects]

2024-06-13 Thread Russ Allbery
Russ Allbery  writes:

> Or, of course, find a way to disable the author/committer checks, which I
> suspect are most of the failures, and keep the object hash checks.

Apologies, I should have done a bit more research before sending my
message.  Adding the following to my .gitconfig allows me to clone the
coreutils repository:

[fetch "fsck"]
   missingSpaceBeforeDate = ignore

So it would be possible for us to develop a list of problems like this to
ignore for the purposes of tag2upload if we believed they were
unimportant.
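
A minimal sketch of how such an ignore list could be turned into a clone
command.  The helper name and the idea of passing the list as an argument
are assumptions made for illustration; this is not tag2upload code.  It
also assumes that `git clone -c` applies `fetch.fsck.<msg-id>` the same
way as the .gitconfig entry above.

```python
# Hypothetical helper: build a `git clone` command line that keeps object
# checking enabled but downgrades selected fsck message IDs to "ignore".
def fsck_clone_argv(url, dest, ignored_msg_ids=()):
    # fetch.fsckObjects makes the fetch done by clone check received objects.
    argv = ["git", "clone", "-c", "fetch.fsckObjects=true"]
    for msg_id in ignored_msg_ids:
        # One fetch.fsck.<msg-id>=ignore setting per tolerated problem.
        argv += ["-c", f"fetch.fsck.{msg_id}=ignore"]
    # "--" keeps a URL that starts with "-" from being read as an option.
    return argv + ["--", url, dest]
```

The resulting list could be handed to `subprocess.run(argv, check=True)`,
so that any remaining fsck error fails the operation.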

I'm a little bit dubious about this.  If I were the dgit-repos archive
maintainer, I would want to enforce an invariant that all repositories
passed git fsck because it feels like an annoying slippery slope to open.
But it would be possible to go this direction.

> The alternative would be to add some sort of support for fsck.skipList,
> but that seems like annoying and arguably unnecessary complexity that
> potentially reintroduces the same security problem via a different
> route.

I see that fsck.skipList explicitly says that corrupt objects cannot be
skipped, so while the rest of this paragraph continues to apply, I am now
less concerned about this introducing security issues.

-- 
Russ Allbery (r...@debian.org)  



Re: Security review of tag2upload [transfer.fsckObjects]

2024-06-13 Thread Russ Allbery
Simon Josefsson  writes:

> I have had that setting in my .gitconfig for several years, and the
> number of git repositories that fail to clone with it is not negligible.
> I encourage people to enable it and experiment for themselves.  Try
> https://git.savannah.gnu.org/git/coreutils.git for example.  I hack
> around it by adding a 'fclone' alias, and I still need to use it once in
> a while.

> [transfer]
>   fsckObjects = true
> [alias]
>   fclone = clone -c "fetch.fsckObjects=false"

I've been using this setting for over four years (I think maybe about six
years) and have never had a failure before this email message, just to add
another data point.  It may depend on whether you work with a lot of Git
repositories from very early adopters.

I started using this setting when I discovered a Git repository at work
that had become silently corrupted due to disk or memory corruption and
went undetected for a disturbing length of time.

I agree that the coreutils repository does fail:

error: object a6727941433ee1c91a20ede6cb381af1d18c566d: missingSpaceBeforeDate: invalid author/committer line - missing space before date

(That also confirms my suspicion that if you set this Git configuration
setting, git clone will fail if it detects any invalid objects.)

I don't know if there is a setting that disables the checks for more
pedantic issues such as malformed author/committer lines but retains the
verification of object hashes.

> Maybe the number of repositories on Salsa with this problem is low, but
> isn't the tag2upload design vulnerable to upstream git repositories
> having this problem too?

I think it would be fine for tag2upload to refuse to work on repositories
that cannot pass git fsck.  I know there are some good reasons why people
decline to fix old repositories with things that Git considers to be
corruption, but tag2upload is not intended to be universal and is strictly
optional and already doesn't support some things that people may wish to
do.  This feels like a reasonable one to add to that list, to me.

The alternative would be to add some sort of support for fsck.skipList,
but that seems like annoying and arguably unnecessary complexity that
potentially reintroduces the same security problem via a different route.

Or, of course, find a way to disable the author/committer checks, which I
suspect are most of the failures, and keep the object hash checks.

-- 
Russ Allbery (r...@debian.org)  



Re: Security review of tag2upload [transfer.fsckObjects]

2024-06-13 Thread Simon Josefsson
Russ Allbery  writes:

>> Can this be substantiated?  Using SHA1CD in Git does not necessarily
>> mean someone cannot manually create a Git repository with a colliding
>> git commit somewhere in the history that gets accepted by git, and
>> allows someone to replace actual file contents.  That may be the case,
>> but I haven't seen any detailed analysis answering that.
>
> This was a really interesting point that I didn't catch.  Thank you!  Let
> me try to rephrase this in the form of an attack and see if this captures
> what you were getting at.
>
> The attack: Using a pre-SHA-1DC version of Git, construct a benign and
> malicious pair of Git trees that diverge at some point by abusing the hash
> of an object vulnerable to SHAttered.

Right, and that this happened at some point in the past rather than on
the git tag commit id.  This git repository could be the upstream or the
debian git repository.

> Push the benign tree to Salsa, relying on Salsa not reverifying the
> hashes of new objects with a hardened hash, or alternately have
> already planted the benign tree in a Git repository imported into
> Salsa before SHA-1DC was in use.  Get that tree signed by a sponsor,
> again relying on the sponsor's git client not revalidating object
> hashes, and then follow the same attack pattern in either "Moving the
> tag" or "Replacing the upstream tree."  Rely on the tag2upload server
> not reverifying the hashes of the Git tree when it pulls it to
> construct the signed source package.
>
> In essence, this attack exploits the fact that Git is lazy about
> performing hashes and usually only does so when it has to.  I'm not sure
> this assumption is correct for Salsa in particular, but it's at least
> plausible.  The trees used in this attack would fail git fsck, because the
> critical object would hash to a different value using SHA-1DC than it does
> with SHA-1, but it's not clear that git fsck is called at any of the
> points that would detect this attack.
>
> I believe this attack would be prevented by setting transfer.fsckObjects
> to true in the Git configuration of the tag2upload worker and failing the
> operation if it detects anything.  (Or, equivalently, calling git fsck
> after git clone and failing on any detected problems.)  I believe this
> forces recomputation of the hashes of all received objects.  The object
> used in this attack would fail that hash recomputation because the
> tag2upload server would use a version of Git that uses SHA-1DC.  The cost
> is a performance penalty on git clone, which would be trivial for most
> repositories but which might be noticeable for particularly large Git
> trees.
>
> My personal opinion is that always setting transfer.fsckObjects to true is
> good practice anyway to catch more banal problems such as disk corruption
> and memory bit flips, so while I'm not sure I would bother just for this
> attack, it might be a good idea on general principles.

I have had that setting in my .gitconfig for several years, and the
number of git repositories that fail to clone with it is not negligible.
I encourage people to enable it and experiment for themselves.  Try
https://git.savannah.gnu.org/git/coreutils.git for example.  I hack
around it by adding a 'fclone' alias, and I still need to use it once in
a while.

[transfer]
  fsckObjects = true
[alias]
  fclone = clone -c "fetch.fsckObjects=false"

Maybe the number of repositories on Salsa with this problem is low, but
isn't the tag2upload design vulnerable to upstream git repositories
having this problem too, in the part of tag2upload that re-generates the
*.orig.tar.gz file?  Maybe I'm missing how that is supposed to work, but
relying on debian/watch does not strongly/uniquely identify a release
artifact.  Many distributions store SHA256 hashes of the expected
upstream release tarball, and this is a good practice that Debian
doesn't support to my knowledge.
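
A minimal sketch of the kind of pinned-hash check described here, using
only the Python standard library.  The function names are hypothetical,
and where the pinned value would be stored is left open.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of a tarball's bytes as lowercase hex."""
    return hashlib.sha256(data).hexdigest()


def matches_pin(data: bytes, pinned_hex: str) -> bool:
    """True only if the tarball hashes to the recorded (pinned) value."""
    # compare_digest avoids leaking the position of the first mismatch.
    return hmac.compare_digest(sha256_hex(data), pinned_hex.lower())
```

A downloaded release artifact would be rejected whenever `matches_pin`
returns False, regardless of what debian/watch pointed at.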

/Simon




Re: source tarballs vs. source from git

2024-06-13 Thread Sean Whitton
Hello,

On Fri 14 Jun 2024 at 04:42am +09, Mike Hommey wrote:

> On Thu, Jun 13, 2024 at 08:28:28PM +0800, Sean Whitton wrote:
>> Hello,
>>
>> On Wed 12 Jun 2024 at 03:45pm +01, Simon McVittie wrote:
>>
>> > As far as I know, git-archive (and therefore git-deborig) doesn't guarantee
>> > that repeatedly archiving the same git tree produces the same tarball,
>> > which could be awkward for the ftp archive's tarball-integrity-based rules;
>> > but hopefully tag2upload would insulate individual developers from that by
>> > always "doing the right thing" for the current contents of the archive?
>>
>> Just to note quickly that yes, it does.
>
> A given version of git does. It's not guaranteed that a newer version of git
> won't produce something different (such a change happened recently-ish).

Sorry -- I meant that tag2upload does the right thing in pulling from
the archive if there is an orig there.
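
A minimal sketch of that rule, not tag2upload's actual code.  The
function name and the callback are hypothetical, and the tarballs are
modelled as plain bytes.

```python
from typing import Callable, Optional


def choose_orig(archive_orig: Optional[bytes],
                regenerate: Callable[[], bytes]) -> bytes:
    """Prefer the orig tarball already in the archive.

    Regenerating from git is only the fallback, because a different git
    version may not produce a byte-identical tarball for the same tree.
    """
    if archive_orig is not None:
        return archive_orig
    return regenerate()
```

With this shape, the output of git-archive only matters for the first
upload of a given upstream version.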

-- 
Sean Whitton



Re: Security review of tag2upload

2024-06-13 Thread Sam Hartman
> "Russ" == Russ Allbery  writes:

Russ> The attack that Simon is talking about doesn't require a
Russ> preimage attack, only a successful collision attack against
Russ> Git trees using SHAttered plus some assumptions about where
Russ> Git may be lazy about revalidating hashes.  It's an
Russ> interesting point that I didn't think of, although I'm not
Russ> sure that it would work against GitLab and thus against Salsa
Russ> and I think it's fairly trivial to protect against regardless.

Russ, I'm trying and failing to find time to write a long response to
your security review.

But I'm going to try to make this point because I think it is relevant
and because it came up a number of times while I was thinking about your
analysis.

You talk a number of times about whether an attack is possible against
salsa.  But especially when thinking about detection and tracing, I
think that things that are verified by signatures made with keys not
held by the system in question are harder to modify than things that can
be verified only so long as a system remains trusted.
One of the things that I found striking in the xz-utils attack was how
two different systems (the release tarballs and the git archives) might have
been exploited to hide an attack.  If people looked at git and not the
archive, they might conclude there was no attack, even if they had been
alerted that there might be a problem.

Which is to say, especially in the moment when considering an incident,
people are very bad about reasoning about whether views of a system are
equivalent.

So, I consider the following to be useful to an attacker--to be threats
worth mitigating:

1) Attacker uploads malicious code to the archive.

2) Attacker possibly through a compromise of the dgit server and salsa
changes the git view to be something harmless.

Such an attack can be detected by regularly verifying the archive
contents against git versions.
I do not have confidence that verification will happen.
At least in my experience if I have a choice between git and unpacking a
dsc, I'll take git almost all the time.
I realize there are DDs who prefer the dsc.
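
A minimal sketch of such a verification, assuming the source package has
already been unpacked into one directory and the git tag checked out into
another.  The function names are hypothetical.  It compares file contents
only and ignores file modes and symlink targets.

```python
import hashlib
from pathlib import Path


def tree_digests(root, skip=(".git",)):
    """Map each file's relative path to the SHA-256 of its contents."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        rel = path.relative_to(root)
        # Skip git's own metadata and anything that is not a regular file.
        if rel.parts[0] in skip or not path.is_file():
            continue
        digests[rel.as_posix()] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def tree_differences(unpacked_dsc, git_checkout):
    """Return sorted paths that are missing on one side or differ in content."""
    a, b = tree_digests(unpacked_dsc), tree_digests(git_checkout)
    return sorted(p for p in a.keys() | b.keys() if a.get(p) != b.get(p))
```

An empty result means the two views agree on file contents.  Anything
else names the paths someone would need to look at.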

Still, my initial read of your analysis is that you discount attacks
like this more than makes sense to me.
I also believe that hash collisions may make attacks like the above more
possible.

My strong suspicion is that even if the classes of attack I am thinking
about are given the consideration I think they need, tag2upload will
still be reasonable from a security architecture standpoint.




Re: [RFC] General Resolution to deploy tag2upload

2024-06-13 Thread Russ Allbery
Andreas Tille  writes:

> From a user perspective some intermediate binary build wouldn't be more
> difficult, though.

It would be nice to not do this on the tag2upload server, though, to
maintain some security separation.

I think it's possible to avoid running arbitrary code from the package
during a source package build because tag2upload doesn't need to run the
clean step since it's starting from a fresh Git checkout (please check me
on this).  If I'm correct, it's much harder to attack the source package
worker than it is to attack a buildd, which is arbitrary code execution
from the package by design.  You have to find an exploit in the source
package construction code first.

Yes, the binary build would be sandboxed, of course, but Linux sandboxing
isn't perfect.  I would feel more comfortable if binary builds were done
somewhere without any access to the tag2upload signing key, even via a
sandbox break.

I may be over-solving this problem given that the same sandbox break
attack could probably be turned into a persistent compromise of an amd64
buildd, which would be arguably worse than compromising the tag2upload
server.  But still, binary builds are inherently risky and the more
sandboxing we can put around them, the better.  Ideally we would run them
in disposable VMs that we reset after each build.

Given all of that, I think it would be more promising to look into a
deeper integration with Salsa to check if the Salsa CI has succeeded, as
discussed earlier in this thread.  That would also match common upstream
practice in Git-first development where the workflow for generating the
release artifact depends on all of the tests passing through the normal CI
mechanism.
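
A minimal sketch of what such a check could look like, assuming the
pipeline list has already been fetched from the GitLab API as a list of
dicts with "id", "sha" and "status" fields.  The function name is
hypothetical, and treating the highest id as the most recent pipeline is
an assumption.

```python
def ci_passed(pipelines, commit_sha):
    """True only if the newest pipeline for commit_sha succeeded."""
    for_commit = [p for p in pipelines if p.get("sha") == commit_sha]
    if not for_commit:
        # No CI run for the tagged commit counts as "not passed".
        return False
    # Assumption: pipeline ids grow over time, so max id is the latest run.
    latest = max(for_commit, key=lambda p: p["id"])
    return latest.get("status") == "success"
```

The service would then refuse to build the source package for a tag
whose commit does not satisfy this check.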

-- 
Russ Allbery (r...@debian.org)  



Re: Security review of tag2upload

2024-06-13 Thread Russ Allbery
Marco d'Itri  writes:
> si...@josefsson.org wrote:

>> Can this be substantiated?  Using SHA1CD in Git does not necessarily
>> mean someone cannot manually create a Git repository with a colliding
>> git commit somewhere in the history that gets accepted by git, and
>> allows someone to replace actual file contents.  That may be the case,
>> but I haven't seen any detailed analysis answering that.

> This is quite a strong assertion, and it is up to you to prove it.  The
> current consensus among cryptography experts is that SHA-1 is still
> resistant to preimage attacks.

The attack that Simon is talking about doesn't require a preimage attack,
only a successful collision attack against Git trees using SHAttered plus
some assumptions about where Git may be lazy about revalidating hashes.
It's an interesting point that I didn't think of, although I'm not sure
that it would work against GitLab and thus against Salsa and I think it's
fairly trivial to protect against regardless.  I'm working on a longer
response; I needed to do a bit of research first.

-- 
Russ Allbery (r...@debian.org)  



Re: [RFC] General Resolution to deploy tag2upload

2024-06-13 Thread Andreas Tille
Hi Ian,

On Thu, Jun 13, 2024 at 12:47:35PM +0100, Ian Jackson wrote:
> Andreas Tille writes ("Re: [RFC] General Resolution to deploy tag2upload"):
> > That means some package build process is done before the source
> > package is forwarded to dak and sends some e-mail back?
> 
> Only a source package build.

Thank you for the clarification.
 
> > I know we have this.  My point is that tag2upload users might forget
> > to use it before using tag2upload service.  I simply want to make
> > sure that tag2upload is not another way to upload anything that does
> > not build on buildservices.
> 
> I'm afraid that tag2upload is precisely another way to do that.
> 
> That's because that's how uploading works now, and tag2upload is
> another way to make an upload.  Uploads must be source-only nowadays
> (in most cases).  So there is, by design, nothing in the existing
> setup that ensures that a maintainer built binaries.

That's correct.

> (I get the
> feeling that you're not happy with this situation, but that's how
> Debian is now, and I think it's a jolly good thing.)

I wanted to clarify whether this is the case.  At least I would see the
potential in tag2upload to do some intermediate step.  I do not really
know whether it's worth burning lots of CPU cycles for an extra binary
build since I trust my fellow developers to do what they are expected to
do.  However, there might be situations where people might make mistakes
or really assume that something builds after a really slight, but
breaking change.
 
> You might argue that tag2upload makes this worse because it makes it
> easier to perform uploads.  It certainly *does* make it easier to
> perform uploads.  That's a big part of the point.

I perfectly understand this.  If I considered this bad or good I
would have said so.  I simply lack experience of how many broken
source-only uploads are hitting dak, and we will see whether this number
increases if tag2upload becomes established (which I
hope).
 
> I think this can only be a *downside* if you think it is a good
> thing that uploading is difficult.

From a user perspective some intermediate binary build wouldn't be more
difficult, though.  I think we could make things safer at the
expense of extra power consumption and, for large packages (which could
be white-listed), extra delay.  This is by no means a pro or con
argument.  I just wanted to throw in that idea for comments.

Kind regards and thanks for all the effort in tag2upload
   Andreas.

-- 
https://fam-tille.de



Re: [RFC] General Resolution to deploy tag2upload

2024-06-13 Thread Gunnar Wolf
Sean Whitton said [Thu, Jun 13, 2024 at 05:42:25AM +0800]:
> > Actually, we can set acls on fingerprints and then that key won't be able
> > to upload anymore. That is not something recorded in the keyrings or the
> > DM list. Obviously that is not something used often (really really
> > seldom), it is more for "this key is compromised badly, please turn off
> > anything with it *NOW*" situations, which is what Helmut meant with the
> > urgent cases.
> 
> Could you say more specifically how seldom, and also how long it usually
> takes between you flicking the emergency switch, and the keyring team
> pushing an update?

Quite hard to say.

We have tried to cover different timezones between the (currently)
three of us in keyring-maint, but it's not that uncommon we are all in
North America. Sadly, it's not as common as I'd wish that we are all
at DebConf.

Usually, when we are notified of a compromised key (or keys that have
to be urgently removed for urgent reasons), we act on it as soon as
one of us can take it, and the keyring preparation + update + push
process takes about one hour, tops. But there can be many reasons the
three of us (keyring-maints) are unreachable for several hours.



Re: Security review of tag2upload

2024-06-13 Thread Russ Allbery
Thank you very much for this review!  You were one of the people I most
wanted to hear from, since I know you have substantial expertise in the
cryptography parts of this and a lot of security experience.  I know you
also had concerns in the past about hash collisions specifically.

Simon Josefsson  writes:

> Generally I reach the same conclusion, although I think there are real
> security problems with both the existing and the proposed tag2upload
> mechanism that we should all be aware of.

Yes.  tag2upload doesn't fundamentally change the security model of
uploads; most flaws that we currently have we still have with tag2upload.

> It is acceptable to realize that we cannot protect against all attacks
> with reasonable costs.  That's why we need the ability to transparently
> audit all steps, to detect them when they occur.  Conversely: it would be
> unfortunate to say no to new functionality because the new functionality
> doesn't solve all possible problems.  That just stalls progress.

Wholeheartedly agreed.

>> ## Threat model
>>
>> I evaluated both the existing source package upload architecture and the
>> tag2upload architecture against the following threats:
>>
>> - Someone not in the keyring uploads a malicious source package, possibly
>>   via a sponsor.
>>   
>> - Someone in the keyring (either a Debian Developer or a Debian Maintainer
>>   for a package) uploads a malicious source package but makes it appear
>>   that the package was uploaded by someone else in the keyring.
>>
>> - An attacker compromises the system a Debian uploader uses to build
>>   source packages and uses that access to inject malicious code into a
>>   source package.
>>
>> - Someone with administrative access to the archive processing machinery
>>   (DAK, the archive signing key, or similar infrastructure) uploads a
>>   malicious source package.
>>   
>> - Someone with administrative access to the tag2upload server or its
>>   signing key uploads a malicious source package.
>>   
>> - Someone with administrative access to Salsa uploads a malicious source
>>   package.

> Having a threat model is great.  I find the notion of "uploads a source
> package" is poorly defined here though.

Yes, I used this sloppily.  In some places I mean uploads the package to
dak, in some places I mean pushed a signed Git tag, and in other places I
mean introduces the package into the archive.  I'll try to find some time
to tighten up the wording to be a bit clearer about which specific action
I'm talking about in each case.

> Which of those threat models (if any) covers the situation where someone in
> the keyring uploads a (benign) source package and something on Debian's
> side (e.g., design of tag2upload) enables an attacker to substitute some
> part of the intended upload with something malicious?

It's effectively equivalent to:

- Someone with administrative access to the tag2upload server or its
  signing key uploads a malicious source package.

and is discussed at some length in the corresponding section, but I should
call that out as a separate threat.

>> ### Git object collisions
>>
>> The current Git repository format and wire protocols use SHA-1 hash
>> digests (and only SHA-1 hash digests) to identify objects in the Git
>> repository. Git uses a SHA-1 hash function that has been
>> [hardened against the SHAttered attack on 
>> SHA-1](https://github.com/cr-marcstevens/sha1collisiondetection),
>> and therefore is probably not vulnerable to known collision attacks.

> Can this be substantiated?  Using SHA1CD in Git does not necessarily
> mean someone cannot manually create a Git repository with a colliding
> git commit somewhere in the history that gets accepted by git, and
> allows someone to replace actual file contents.  That may be the case,
> but I haven't seen any detailed analysis answering that.

This was a really interesting point that I didn't catch.  Thank you!  Let
me try to rephrase this in the form of an attack and see if this captures
what you were getting at.

The attack: Using a pre-SHA-1DC version of Git, construct a benign and
malicious pair of Git trees that diverge at some point by abusing the hash
of an object vulnerable to SHAttered.  Push the benign tree to Salsa,
relying on Salsa not reverifying the hashes of new objects with a hardened
hash, or alternately have already planted the benign tree in a Git
repository imported into Salsa before SHA-1DC was in use.  Get that tree
signed by a sponsor, again relying on the sponsor's git client not
revalidating object hashes, and then follow the same attack pattern in
either "Moving the tag" or "Replacing the upstream tree."  Rely on the
tag2upload server not reverifying the hashes of the Git tree when it pulls
it to construct the signed source package.
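
A minimal sketch of the hash recomputation that this attack relies on
being skipped.  Git names an object by hashing a short header followed by
the object body.  The function names here are hypothetical.  Note that
Python's hashlib provides plain SHA-1 with no collision detection, so
this illustrates recomputation only, not the SHA-1DC hardening.

```python
import hashlib


def git_object_id(obj_type: str, body: bytes) -> str:
    """Compute a Git object name: SHA-1 over "<type> <size>\\0" + body."""
    header = f"{obj_type} {len(body)}".encode("ascii") + b"\0"
    # Plain SHA-1: a collision-detecting implementation would go here instead.
    return hashlib.sha1(header + body).hexdigest()


def verify_object(claimed_id: str, obj_type: str, body: bytes) -> bool:
    """Recompute the name instead of trusting the one the object came with."""
    return git_object_id(obj_type, body) == claimed_id
```

A receiver that runs this kind of check on every object it is sent, with
a hardened hash, would not accept an object under a name it does not
hash to.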

In essence, this attack exploits the fact that Git is lazy about
performing hashes and usually only does so when it has to.  I'm not sure
this assumption is correct for Salsa in particular, but it's at least

Re: source tarballs vs. source from git

2024-06-13 Thread Mike Hommey
On Thu, Jun 13, 2024 at 08:28:28PM +0800, Sean Whitton wrote:
> Hello,
> 
> On Wed 12 Jun 2024 at 03:45pm +01, Simon McVittie wrote:
> 
> > As far as I know, git-archive (and therefore git-deborig) doesn't guarantee
> > that repeatedly archiving the same git tree produces the same tarball,
> > which could be awkward for the ftp archive's tarball-integrity-based rules;
> > but hopefully tag2upload would insulate individual developers from that by
> > always "doing the right thing" for the current contents of the archive?
> 
> Just to note quickly that yes, it does.

A given version of git does. It's not guaranteed that a newer version of git
won't produce something different (such a change happened recently-ish).

Mike



Re: [RFC] General Resolution to deploy tag2upload

2024-06-13 Thread Didier 'OdyX' Raboud
On Thursday, 13 June 2024, 08:23:45 CEST, Thomas Goirand wrote:
> One thing I really dislike, is having a single gpg key to upload them
> all. I very much preferred the design that Didier explained during
> Debconf Kosovo, where the .changes signature is uploaded together with
> the tagged commit.

Thomas is referring to my very rough proof-of-concept with specially-crafted 
tags I had quickly presented at DebConf22: first lightning talk, video'ed 
here: https://debconf22.debconf.org/talks/41-lightning-talks/

I had called this "dtag": the idea was to store an inline representation of 
the .dsc and .changes you would produce locally in `git notes` that get 
attached to git tags (to avoid storing very large amounts of signatures in a 
git tag); the uploader pushes the tag, and a job in Salsa CI (or whatever 
tool) then takes the tag, reconstructs the .dsc and .changes, and then (as 
they're validly signed), directly uploads them.

Source is there:
  https://salsa.debian.org/odyx/dtag

A "signing job" is there:
  https://salsa.debian.org/debian/libopenaptx/-/jobs/3020729

I never progressed further than the hacked-around scripts I had done towards 
that lightning talk, mostly for these reasons:

* it's fragile: it depends on bit-by-bit reproducibility of source _and_ 
changes files between what the uploader would do on their machine, and what 
the git tag notes processor does. As the dgit & tag2upload teams have 
demonstrated, there are _many ways_ this can (and will) break.
* while I like the "only upload if the CI passes" feature, I'm not very happy 
with the fact that as uploader, I already granted the right to upload (as all 
the material to generate a signed source package is pushed) before the CI ran 
successfully: an evil bypasser could use my git tag+notes and upload anyway, 
as signed with my key. I don't think it's unique to that implementation though:
all signed processes around git tags have the same issue.
* Debian has fallen behind in my priorities, so I'm uploading less, and 
polishing this has not been on my todo list.
* generating the special git notes is bulky, and requires local software that
is not-pretty Python 3.

By no means do I claim ownership of the idea or the design, or even on the 
proof-of-concept code; if that is seen as worthwhile, I'd be honoured to see 
anyone building on top of this idea, design, or code!

-- 
OdyX





Re: Security review of tag2upload

2024-06-13 Thread Marco d'Itri
si...@josefsson.org wrote:

>Can this be substantiated?  Using SHA-1DC in Git does not necessarily
>mean someone cannot manually create a Git repository with a colliding
>git commit somewhere in the history that gets accepted by git, and
>allows someone to replace actual file contents.  That may be the case,
>but I haven't seen any detailed analysis answering that.
This is quite a strong assertion, and it is up to you to prove it.
The current consensus among cryptography experts is that SHA-1 is still
resistant to preimage attacks.

-- 
ciao,
Marco



Re: [RFC] General Resolution to deploy tag2upload [and 1 more messages]

2024-06-13 Thread Russ Allbery
Simon Richter  writes:

> We might get additional insights after a breach, perhaps, if Github
> decide to take a compromised repository offline and our copy is still
> accessible.

I think this is way more useful than you are making it sound.  Upstream
may not use GitHub at all and instead use their own personal Git servers.
The malicious code may be introduced by a Debian contributor directly in
Debian.  Upstream may have tampered with their Git repository using force
push or the like in a way that causes the desired trace information to be
lost.  There are numerous cases where Debian having its own archive of the
Git history when investigating a compromise would be immensely valuable.

> We have several 90% solutions of mapping Debian packaging onto git, but
> all of these are incomplete and annoying to use because we disagree with
> git on what constitutes data, and what constitutes metadata, so the data
> model does not match reality or requirements, and from a security
> standpoint that concerns me more than improved forensics.

This is why people are working on incremental improvements.  I think such
improvements are more likely to get us closer to where we want to be than
a boil-the-ocean approach that attempts wholesale change to how Debian
works.  It's easy to come up with new designs that in theory would be more
coherent and straightforward, and very hard in practice to avoid that
turning into .

One of the important properties of tag2upload is that it meets people
where they are and tries to support their existing workflows while
providing a more systematic source package construction algorithm.

-- 
Russ Allbery (r...@debian.org)  



Re: [RFC] General Resolution to deploy tag2upload

2024-06-13 Thread Ian Jackson
Scott Kitterman writes ("Re: [RFC] General Resolution to deploy tag2upload"):
> On June 13, 2024 3:02:48 PM UTC, Joerg Jaspert  wrote:
> >I think this is a minor issue, actually. It does not happen often. For
> >the time it will, we can have something like "ftpmaster pushes a list of
> >fingerprints via $mechanism" (ssh forced command is widely used for
> >similar things, for example).
> >
> >That's really simple to implement.
> 
> I agree that this isn't a major design issue, but I think it is something
> that needs to be addressed before deployment of tag2upload.  The need
> is certainly rare, but when it's needed, it's needed because it's important.

I agree.  Also, I don't want to be developing a new shutoff mechanism
during an emergency.  Instead, I have filed #1073157.

I think this should be addressed regardless of t2u, since it affects
current dgit use too.

Russ's suggested resolution is reasonable too, but I don't think it's
sufficient because I want to prevent bad stuff appearing on
*.dgit.do.o, not just in archive.d.o.  Either or both of these
approaches would work.

> It also suggests to me that it's premature to freeze and mandate the current 
> design via GR.

This is a minor detail, easily sorted out.

I don't think passing this GR forbids us from updating the design to
address points like this.  I think it *does* forbid us from updating
the design in ways that Russ and Noodles disapprove of.  But that's
surely right and proper.

Ian.

-- 
Ian Jackson   These opinions are my own.

Pronouns: they/he.  If I emailed you from @fyvzl.net or @evade.org.uk,
that is a private address which bypasses my fierce spamfilter.



Re: [RFC] General Resolution to deploy tag2upload [and 1 more messages]

2024-06-13 Thread Simon McVittie
On Thu, 13 Jun 2024 at 15:08:15 +0100, Ian Jackson wrote:
> I think it is possible that there will be a handful of packages where
> things are significantly more awkward, which might not be able to
> adopt tag2upload.

This would presumably be the same minority of packages where maintainers
use a debian/-only workflow (even if they normally prefer to keep upstream
source in git) and avoid dgit (even if they normally prefer to use it),
because the upstream source is too bulky to be convenient to track in git?
Such as the openarena-data family and other large game assets?

Those packages are already exceptional and already need to be handled
specially. They'd only be a problem if dgit and/or tag2upload became
mandatory, which (as far as I understand it) is not the plan.

smcv



Re: Security review of tag2upload

2024-06-13 Thread Simon Josefsson
Simon Richter  writes:

> Hi,
>
> On 6/13/24 22:27, Simon Josefsson wrote:
>
>> Generally I reach the same conclusion, although I think there are real
>> security problems with both the existing and the proposed tag2upload
>> mechanism that we should all be aware of.  It is acceptable to realize
>> that we cannot protect against all attacks with reasonable costs.
>
> In that case it is kind of disingenuous to highlight the necessity of
> this change by pointing at the xz-utils scenario.

Agreed.  I don't think tag2upload solves anything important from a
security point of view.  I believe tag2upload will enable new attacks,
some attacks that are realistic and will actually occur.  Still I find
myself in mild support of tag2upload, since it enables a workflow that
some people seem to prefer.  IMHO, that's the important aspect.

Excluding people's reasonable positions has demotivated Debian
contributors historically, and I don't think the project or resulting
release artifact is significantly better off as a result.  Using Devuan
to avoid systemd, or Trisquel to avoid non-free software, is poor human
resource utilization.

Git-based workflows seem popular.  If there is some method to support
it (tag2upload), and there are people willing to baby-sit the
implementation (I dunno but assume so), and it doesn't break existing
workflows (I dunno), then my opinion is: why not.

But, please, don't hype this as a solution to xz-utils problems.  The
ftpmaster's conservative response is reasonable, and there are many
unanswered questions about tag2upload, and it is easy to shoot it down
on those grounds.  It would make the case stronger to admit that there
are unanswered questions, and would invite collaborative work to improve
the design.

/Simon




Re: [RFC] General Resolution to deploy tag2upload

2024-06-13 Thread Ian Jackson
Antoine Beaupré writes ("Re: [RFC] General Resolution to deploy tag2upload"):
> Well, isn't tag2upload part of dgit? Or at least git-debpush, the binary
> package, seems to be part of the dgit source package here... we're also
> talking frequently about dgit.debian.org as part of this infrastructure,
> so clearly this whole thing is kind of a part of dgit...

Rather, dgit is part of the implementation strategy for tag2upload,
because dgit is the only program that knows how to canonicalise
git trees and turn them into source packages.

And the t2u code lives in src:dgit because src:dgit has lots of useful
infrastructure, eg for testing involving git and source packages.

> I am not sure saying those things are completely separate here is
> helpful, it would be more useful to clarify exactly what component we're
> adopting and what patterns we need to change if we want to adopt
> this. For example, does this respect DEP-14? Which parts?

tag2upload doesn't concern itself with the ref namespace, so those
parts of DEP-14 aren't relevant.

tag2upload *does* use DEP-14 tag naming, including DEP-14 version
mangling.  (Indeed as part of our git integration work, we enhanced
DEP-14's mangling to cover a missing edge case.)

https://manpages.debian.org/testing/git-debpush/tag2upload.5.en.html#GIT_METADATA
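
For readers unfamiliar with the mangling, here is a minimal sketch of the rules as I understand them from DEP-14: `:` becomes `%`, `~` becomes `_`, and a `#` is inserted after any dot that is followed by another dot, that ends the version, or that precedes a trailing `lock`. The function names and the `debian/` prefix are choices made for this example; check DEP-14 and tag2upload(5) before relying on it.

```python
import re


def dep14_mangle(version: str) -> str:
    """Map a Debian version to a string that is valid in a git ref name."""
    # Characters git refuses in ref names get fixed substitutes.
    mangled = version.translate(str.maketrans(":~", "%_"))
    # Break up "..", a trailing ".", and a trailing ".lock" with "#".
    return re.sub(r"\.(?=\.|\Z|lock\Z)", ".#", mangled)


def debian_tag(version: str) -> str:
    """Tag name for an upload in the debian/ vendor namespace."""
    return "debian/" + dep14_mangle(version)
```

With these rules, `debian_tag("1:2.0~rc1-1")` gives `debian/1%2.0_rc1-1`.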

> > tag2upload and dgit have many additional safety checks that help avoid
> > mistakes.  For example, you can be sure that the git tree you are
> > about to upload is precisely what ends up in the archive - so you can
> > rely on git diff and never need to run debdiff on source packages.
> > It is much harder to accidentally undo an NMU.  etc.
> 
> This brings another question to mind. Right now, I understand that some
> people use dgit for NMUs, on packages they do not own. Does this
> workflow still support the old NMU process where I get a debdiff with an
> upload someone makes for me, or if someone opts in to this process, for
> an NMU, *I*, as a maintainer, now have to figure out dgit? :)

Nothing changes for you.

An NMUer, whether they use dgit or not, is supposed to send you the
diff.  (An NMUer who does their work with dgit will probably use
git-diff or git-format-patch to generate the diffs they send to the
BTS.)

With tag2upload you apply the NMU diff to your git tree in whatever
way you do so currently.

An NMUer can use tag2upload, too.  That doesn't interfere with your
use of it.

> >> 2. does this scale to the archive?
...
> > There is one singleton service push.dgit.d.o, ...
> > Non-uploading clients use {browse,git}.dgit.d.o.  ...
> 
> Are those two different hosts with their own replicas of the git repos?

Yes.  And if browse.* and git.* become too busy, so need to be scaled
up, there will be even more replicas.

> Because then that means we have *three* replicas (push.dgit.d.o,
> browse.dgit.d.o and salsa.d.o) of those repositories...

We have at least five.  You forgot archive.debian.org and
snapshot.debian.org.

(And that's not even counting the maintainer's laptop, temporary
clones/copies on salsa and ci.d.n and buildds, and all the copies
upstream have.)

> > Ultimately, *.dgit.d.o is in some sense a competitor to
> > archive.debian.org, but I don't see us abolishing archive.d.o.
> > Instead, tag2upload is getting us further towards dual running,
> > where we accept either source packages or git trees, and publish both.
> 
> hmmm... maybe I'm missing something, but archive.d.o also has binary
> packages, dgit.d.o doesn't do that, does it? Or are you only referring to
> the source packages part?

Yes, you're right, only the source packages part.

Ian.

-- 
Ian Jackson   These opinions are my own.

Pronouns: they/he.  If I emailed you from @fyvzl.net or @evade.org.uk,
that is a private address which bypasses my fierce spamfilter.



Re: [RFC] General Resolution to deploy tag2upload

2024-06-13 Thread Timo Röhling

Hi,

* Luca Boccassi  [2024-06-13 14:51]:
> I was not, I wasn't suggesting to make this a hard requirement, as
> you say that's more complicated. Merely moving the fire-and-forget
> webhook as the last stage of the pipeline, as the default
> setting/setup/config/whatever. This is not to provide strong
> guarantees, but merely an easy default that encourages a QA pass
> first. Then maintainers can override the pipeline config and skip
> it, if they don't want it for any reason. If it was the default, I
> suspect de-facto the majority of uploads would go through it, and
> we would gain in quality, on average (exceptions apply, etc etc).


Sure, having the option and even having the option as default is 
perfectly fine. If I can instruct tag2upload not to wait for the CI 
on certain uploads, I might even leave it on ;)


Cheers
Timo


--
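⢀⣴⠾⠻⢶⣦⠀   ╭────────────────────────────────────────────────────╮
⣾⠁⢠⠒⠀⣿⡁   │ Timo Röhling                                       │
⢿⡄⠘⠷⠚⠋⠀   │ 9B03 EBB9 8300 DF97 C2B1  23BF CC8C 6BDD 1403 F4CA │
⠈⠳⣄       ╰────────────────────────────────────────────────────╯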




Re: [RFC] General Resolution to deploy tag2upload

2024-06-13 Thread Russ Allbery
Scott Kitterman  writes:

> I agree that this isn't a major design issue, but I think it is
> something that needs to be addressed before deployment of
> tag2upload.  The need is certainly rare, but when it's needed, it's
> needed because it's important.

I don't understand why this would be a blocker given that dak can redo the
authorization check at the same point that it does authorization checks
now, should it so desire.  This does require a small change to dak to
retrieve the key fingerprint from the source package in the case where the
source package is signed with the tag2upload key, but that doesn't seem
too difficult.
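
A very rough sketch of the check described above, under stated assumptions: how dak would actually obtain the original signer's fingerprint from a tag2upload-signed source package is not specified in this thread, so it is passed in here as a plain argument, and every name in the code is hypothetical.

```python
from typing import Optional, Set


def effective_uploader(
    signer_fpr: str,
    t2u_service_fpr: str,
    embedded_fpr: Optional[str],
) -> Optional[str]:
    """Return the fingerprint the authorization check should apply to."""
    # Ordinary upload: the package signer is the uploader.
    if signer_fpr != t2u_service_fpr:
        return signer_fpr
    # Service-signed upload: fall back to the fingerprint recorded in the
    # source package (how it is recorded is an assumption of this sketch).
    return embedded_fpr


def is_authorized(
    signer_fpr: str,
    t2u_service_fpr: str,
    embedded_fpr: Optional[str],
    allowed: Set[str],
) -> bool:
    """Apply the usual allow-list check to the effective uploader."""
    uploader = effective_uploader(signer_fpr, t2u_service_fpr, embedded_fpr)
    return uploader is not None and uploader in allowed
```

The point of the design is that a revoked key fails the same check whether it signed a source package directly or signed a tag that the service converted.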

-- 
Russ Allbery (r...@debian.org)  



Re: [RFC] General Resolution to deploy tag2upload

2024-06-13 Thread Scott Kitterman



On June 13, 2024 3:02:48 PM UTC, Joerg Jaspert  wrote:
>On 17259 March 1977, Ian Jackson wrote:
>
>>> Thanks.  Then possibly it is sufficient for ftpmaster just to disable
>>> tag2upload's whole key until the keyring update is pushed.
>> I'm not sure this is a sufficient answer.  We don't want uploads by
>> revoked keys to appear on *.dgit.d.o either.
>
>> Joerg, is there some way that this fingerprint block information could
>> be made available in a more timely manner?  Ideally we would update
>> push.dgit.d.o to use this information, regardless of tag2upload.
>> (And the t2u conversion system should use it too.)
>
>> I think maybe we should take this to a different venue, than this
>> thread on -vote.  How about a bug against ftp.d.o and/or
>> dgit-infrastructure ?
>
>I think this is a minor issue, actually. It does not happen often. For
>the time it will, we can have something like "ftpmaster pushes a list of
>fingerprints via $mechanism" (ssh forced command is widely used for
>similar things, for example).
>
>That's really simple to implement.

I agree that this isn't a major design issue, but I think it is something that
needs to be addressed before deployment of tag2upload.  The need is
certainly rare, but when it's needed, it's needed because it's important.

It also suggests to me that it's premature to freeze and mandate the current 
design via GR.

Scott K



Re: [RFC] General Resolution to deploy tag2upload

2024-06-13 Thread Joerg Jaspert

On 17259 March 1977, Ian Jackson wrote:


>> Thanks.  Then possibly it is sufficient for ftpmaster just to disable
>> tag2upload's whole key until the keyring update is pushed.
> I'm not sure this is a sufficient answer.  We don't want uploads by
> revoked keys to appear on *.dgit.d.o either.

> Joerg, is there some way that this fingerprint block information could
> be made available in a more timely manner?  Ideally we would update
> push.dgit.d.o to use this information, regardless of tag2upload.
> (And the t2u conversion system should use it too.)

> I think maybe we should take this to a different venue, than this
> thread on -vote.  How about a bug against ftp.d.o and/or
> dgit-infrastructure ?


I think this is a minor issue, actually. It does not happen often. For
the time it will, we can have something like "ftpmaster pushes a list of
fingerprints via $mechanism" (ssh forced command is widely used for
similar things, for example).

That's really simple to implement.
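
As an illustration of how little the receiving side would need, here is a minimal sketch of consuming such a pushed list. The file format (one fingerprint per line, `#` comments) and all function names are assumptions made for this example, not an existing ftpmaster or dgit interface.

```python
from typing import Set


def normalize_fingerprint(fpr: str) -> str:
    """Strip whitespace and upper-case, so spaced and compact forms match."""
    return "".join(fpr.split()).upper()


def parse_blocklist(text: str) -> Set[str]:
    """Parse one fingerprint per line, ignoring blanks and '#' comments."""
    blocked = set()
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            blocked.add(normalize_fingerprint(line))
    return blocked


def is_blocked(fpr: str, blocked: Set[str]) -> bool:
    """True if the signing key's fingerprint is on the pushed list."""
    return normalize_fingerprint(fpr) in blocked
```

Both push.dgit.d.o and the conversion service could consult the same parsed set before accepting a signed tag or upload.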

--
bye, Joerg



Re: [RFC] General Resolution to deploy tag2upload

2024-06-13 Thread Ian Jackson
Antoine Beaupré writes ("Re: [RFC] General Resolution to deploy tag2upload"):
> On 2024-06-13 12:38:36, Ian Jackson wrote:
> > Antoine Beaupré writes ("Re: [RFC] General Resolution to deploy 
> > tag2upload"):
> >> 3. what does this mean for salsa/jenkins/bts/etc?
> >
> > Nothing.
...
> > I don't think I have an opinion about that.  (Or at least, maybe I do,
> > but it's not relevant.)
> 
> I do think it's relevant.  [...]
> 
> You, I suspect, have a bias as well. If you don't state it clearly,
> people will (and have, already!) speculate as to what your underlying
> intentions are. Sean, for example, has clearly stated he likes Salsa and
> wants it to stick around, which probably will comfort people who worry
> about this.

tag2upload doesn't care about Jenkins vs gitlab CI; doesn't care about
BTS vs gitlab issues; and doesn't care about wiki systems.

But, you're right, I do have biases.  My two main biases here:

 1. I think git is great and we should be using it much more.  I think
source packages, which I designed decades ago (and which others
have since added features to), are weird, buggy, and obsolete.
I wish I would never have to deal with source packages.

 2. But, I very much don't want to impose things on anyone.  Debian is
only fun when we all have our autonomy.  Also Debian is very big
and different situations call for different solutions.

So, I try to provide software which people will love to use.

At the risk of derailing things:

Unlike certain other camps in Debian, I definitely don't want to force
anyone to use my software.  I've put my reputation on the line, and
fought many very horrible fights in Debian, to try to help preserve my
co-developers' and users' technological autonomy.  My ideology about
transitioning to git is no different.

I see the fear others have here, that the dgit and tag2upload projects
are somehow a plot to force everyone into using some weird thing I
invented.  Given Debian's overall attitude, and past crises and
events, that is a reasonable fear.

But no-one has anything to fear from *me* on that point.

I say this even though I think currently mainstream practices in
Debian as a whole fail to properly provide our users with the source
code.  IMO we, as a project, are grievously failing to meet our core
objectives.

My answer to this is to try to provide tools that enable us all to do
our work, effectively, and also meet our ideological obligations.
dgit is part of that.  tag2upload is the next step.

My biases are hardly secret.  Try maybe these two blog posts:
  https://diziet.dreamwidth.org/17579.html
  https://diziet.dreamwidth.org/9556.html
They are aimed at our non-Debian-expert users.  They contain
apologies.  I don't want to have to apologise on Debian's behalf.
So I'm trying to make it easy for everyone in Debian to do better.

> I think if you, in particular, would speak your mind about this, it
> could help alleviate some of those concerns, or at least clarify the
> scope of concerns people should have. :p

HTH.

Ian.

-- 
Ian Jackson   These opinions are my own.

Pronouns: they/he.  If I emailed you from @fyvzl.net or @evade.org.uk,
that is a private address which bypasses my fierce spamfilter.



Re: [RFC] General Resolution to deploy tag2upload

2024-06-13 Thread Luca Boccassi
On Thu, 13 Jun 2024 at 14:34, Timo Röhling  wrote:
>
> Hi,
>
> * Luca Boccassi  [2024-06-13 14:23]:
> >As far as I understand in the current proposal the trigger is a
> >webhook running on Salsa after a push - have you considered instead
> >having the trigger be a stage in the salsa-ci pipeline, that would run
> >after the previous stages have completed successfully?
> I hate that idea. From past experience, the Salsa CI pipeline is
> slower and much more flaky than the buildds, so I'm not going to
> spend several hours (and retries) per upload waiting to see if the
> Salsa CI deemed my upload worthy.

Pipeline stages are by definition locally controlled, as the
configuration lives in the git repository it runs for, so a maintainer
can disable some or all of them on their repositories.
From past experience, the Salsa CI pipeline has been a huge boon,
finding issues that would have otherwise been missed, and ensuring
everything goes under the same degree of testing, regardless of where
it comes from. Not to mention the massive time saver it is: just push
and come back later and see the result, instead of driving things
manually.



Re: [RFC] General Resolution to deploy tag2upload [and 1 more messages]

2024-06-13 Thread Ian Jackson
Simon Richter writes ("Re: [RFC] General Resolution to deploy tag2upload [and 1 
more messages]"):
> On 6/13/24 20:29, Marco d'Itri wrote:
> > Of course: this makes auditing much easier.
> 
> That is a *massive* amount of data though, especially if we're expected 
> to import the entire upstream git history as well and base the packaging 
> branch on top of an upstream commit.

In practice IME the full git history is usually a small integer
multiple of the current codebase size.  archive.d.o already has to
contain many copies of the source code.  And we already want to keep
all uploaded versions indefinitely (that's what snapshot.d.o is).

I think it is possible that there will be a handful of packages where
things are significantly more awkward, which might not be able to
adopt tag2upload.

> We will also need to be prepared for removal requests, so there needs to 
> be a procedure in place for that, people authorized to perform it, and 
> an audit framework for that.

Yes.

The dgit git server is already set up for this, from an infrastructure
point of view.  We have mechanisms that allow an administrator (Sean
or I in this case) to rewind a repository, and also to prevent harmful
git objects from being re-uploaded.

In the 10 years that dgit-repos has existed, this has been necessary
once, due to a bug in dgit itself (that caused corrupted commits that
were accepted by git but rejected by most forges, #850469/#849041).

We don't have a very developed process flow but I think we could
probably make it up without too much trouble.

> We could add some mechanisms, like enforcing that merge commits pulling 
> in a new upstream version will only modify files outside of debian/ in 
> one subtree, and files inside debian/ in the other, but that conflicts 
> with workflows that maintain Debian-specific patches as commits instead 
> of patch files.

This sounds a bit like `git-debrebase`, which is a git workflow tool -
a competitor to gbp and git-dpm.  The src:xen team uses it, for
example.  But I don't think it's suitable for everyone.

git packaging and delta management workflows (and the associated
tools) all have both strengths and weaknesses.  Maintainers are going
to have to continue to decide which set of tradeoffs to choose.

> We have several 90% solutions of mapping Debian packaging onto git, but 
> all of these are incomplete and annoying to use because we disagree with 
> git on what constitutes data, and what constitutes metadata, so the data 
> model does not match reality or requirements, and from a security 
> standpoint that concerns me more than improved forensics.

I think tag2upload brings Debian's model closer to git and to
upstream's.  That's a big part of the point.

Ian.

-- 
Ian Jackson   These opinions are my own.

Pronouns: they/he.  If I emailed you from @fyvzl.net or @evade.org.uk,
that is a private address which bypasses my fierce spamfilter.



Re: [RFC] General Resolution to deploy tag2upload

2024-06-13 Thread Sean Whitton
Hello,

On Thu 13 Jun 2024 at 03:11pm +02, Pierre-Elliott Bécue wrote:

> Sean Whitton  wrote on 13/06/2024 at 14:44:57+0200:
>
>> Hello,
>>
>> On Thu 13 Jun 2024 at 01:05pm +02, Ansgar  wrote:
>>
>>> The statement also reads like the implementation was reviewed by Russ
>>> which as far as I understand isn't the case either? Or do you only plan
>>> to deploy a version once such a review happened?
>>
>> We weren't planning for this to be done, no.
>
> I'm sorry but I have a problem here.
>
> You stated in your first mail that both rra and noodles audited your
> work, and here it seems that audited is potentially a bit more than what
> has been done.
>
> Could you elaborate explicitly on what you mean with "audited"?

That's fair, I'll disambiguate the wording of (2).

Russ and Jonathan thoroughly reviewed the design and its security
properties.  They've looked at bits of the implementation but not in a
completely systematic way.

-- 
Sean Whitton



Re: [RFC] General Resolution to deploy tag2upload

2024-06-13 Thread Antoine Beaupré
On 2024-06-13 12:38:36, Ian Jackson wrote:
> Antoine Beaupré writes ("Re: [RFC] General Resolution to deploy tag2upload"):
>> Right now my workflow is basically git-buildpackage + salsa + dput,
>> reluctantly using pristine-tar sometimes.
>
>> I have *tried* to use dgit, but [...]
>
> I think maybe I should make a blog post explaining what dgit is, and
> isn't.  But that's probably rather out-of-scope for this thread.

I think the dgit documentation is actually pretty good, I don't think
that's the issue here.

>> 1. how does this change my gbp/salsa/dput workflow?
>>    can I *just* s/dput/dgit/?
>
> By "this" I'm going to take you to mean "tag2upload".
> With tag2upload you don't run dgit.

Well, isn't tag2upload part of dgit? Or at least git-debpush, the binary
package, seems to be part of the dgit source package here... we're also
talking frequently about dgit.debian.org as part of this infrastructure,
so clearly this whole thing is kind of a part of dgit...

I am not sure saying those things are completely separate here is
helpful, it would be more useful to clarify exactly what component we're
adopting and what patterns we need to change if we want to adopt
this. For example, does this respect DEP-14? Which parts?

> You replace gbp/salsa/dput with git-debpush.  git-debpush will push
> your branch to salsa for you, as well as making and pushing the git tag.
> The tag2upload service will take care of the rest.

See that's a different answer than what Sean said, I believe, which is
that you replace `dpkg-buildpackage -S` with git-debpush. :)

> You'll still want to run gbp, etc., as part of your pre-upload
> testing, of course.

Right. But I don't upload the resulting source package, that's thrown
away, basically...

>> Can I just keep doing gbp + salsa and switch the "dput" bit to
>> "dgit" or "tag2upload" without changing anything else? That would be
>> kind of neat, but I'm not sure *why* I would do that in the first
>> place...

[...]

> tag2upload and dgit have many additional safety checks that help avoid
> mistakes.  For example, you can be sure that the git tree you are
> about to upload is precisely what ends up in the archive - so you can
> rely on git diff and never need to run debdiff on source packages.
> It is much harder to accidentally undo an NMU.  etc.

This brings another question to mind. Right now, I understand that some
people use dgit for NMUs, on packages they do not own. Does this
workflow still support the old NMU process where I get a debdiff with an
upload someone makes for me, or if someone opts in to this process, for
an NMU, *I*, as a maintainer, now have to figure out dgit? :)

>> 2. does this scale to the archive?
> ...
>> So what's the plan for dealing with the sheer size of the Debian
>> archive, assuming that eventually everything might reasonably be
>> expected to be *both* on dgit and salsa, if I understand the proposal
>> correctly?
>
> It's true that this is a lot of data.  It's going to be comparable in
> size to the archive.  Scalability is a reasonable concern.
>
> There is one singleton service push.dgit.d.o, which is used only by
> uploaders (and the tag2upload robot).  So it shouldn't become
> overloaded.
>
> Non-uploading clients use {browse,git}.dgit.d.o.  Currently that is a
> single host, which is also shared with some other services.  But it is
> a read-only mirror and we could scale up to multiple mirrors.

Are those two different hosts with their own replicas of the git repos?
Because then that means we have *three* replicas (push.dgit.d.o,
browse.dgit.d.o and salsa.d.o) of those repositories...

[...]

>> 3. what does this mean for salsa/jenkins/bts/etc?
>
> Nothing.
>
>> In the long term, what do you actually think we should do about the
>> duplication of tools out there? We are wasting a lot of energy here
>> maintaining two CI systems (Jenkins and GitLab CI), two bug trackers
>> (BTS and GitLab issues), two wiki systems (MoinMoin and GitLab Wikis),
>
> I don't think I have an opinion about that.  (Or at least, maybe I do,
> but it's not relevant.)

I do think it's relevant. Right now, there's a huge tension between "the
old ways" of doing things, and some people (at least me) believe we
should be converging towards a smaller set of standard tools that we all
use. I, for example, believe we should all use debhelper and ditch CDBS,
use Git (and Salsa) for keeping history, use git-buildpackage as a
standard workflow. I don't believe it's productive to go around fighting
to standardize this, or at least I don't know how to get there, but
that's my bias.

You, I suspect, have a bias as well. If you don't state it clearly,
people will (and have, already!) speculate as to what your underlying
intentions are. Sean, for example, has clearly stated he likes Salsa and
wants it to stick around, which probably will comfort people who worry
about this.

I think if you, in particular, would speak your mind about this, it

Re: Security review of tag2upload

2024-06-13 Thread Simon Richter

Hi,

On 6/13/24 22:27, Simon Josefsson wrote:


> Generally I reach the same conclusion, although I think there are real
> security problems with both the existing and the proposed tag2upload
> mechanism that we should all be aware of.  It is acceptable to realize
> that we cannot protect against all attacks with reasonable costs.


In that case it is kind of disingenuous to highlight the necessity of 
this change by pointing at the xz-utils scenario.


   Simon



Re: Security review of tag2upload

2024-06-13 Thread Simon Josefsson
Russ Allbery  writes:

> The decision on whether to adopt tag2upload should be made primarily on
> non-security grounds.

Generally I reach the same conclusion, although I think there are real
security problems with both the existing and the proposed tag2upload
mechanism that we should all be aware of.  It is acceptable to realize
that we cannot protect against all attacks with reasonable costs.
That's why we need the ability to transparently audit all steps, to
detect them when they occur.  Conversely: it would be unfortunate to say
no to new functionality because the new functionality doesn't solve all
possible problems.  That just stalls progress.

> ## Threat model
>
> I evaluated both the existing source package upload architecture and the
> tag2upload architecture against the following threats:
>
> - Someone not in the keyring uploads a malicious source package, possibly
>   via a sponsor.
>   
> - Someone in the keyring (either a Debian Developer or a Debian Maintainer
>   for a package) uploads a malicious source package but makes it appear
>   that the package was uploaded by someone else in the keyring.
>
> - An attacker compromises the system a Debian uploader uses to build
>   source packages and uses that access to inject malicious code into a
>   source package.
>
> - Someone with administrative access to the archive processing machinery
>   (DAK, the archive signing key, or similar infrastructure) uploads a
>   malicious source package.
>   
> - Someone with administrative access to the tag2upload server or its
>   signing key uploads a malicious source package.
>   
> - Someone with administrative access to Salsa uploads a malicious source
>   package.

Having a threat model is great.  I find the notion of "uploads a source
package" is poorly defined here though.

Which of those threat models (if any) covers the situation where someone in
the keyring uploads a (benign) source package and something on Debian's
side (e.g., design of tag2upload) enables an attacker to substitute some
part of the intended upload with something malicious?

With tag2upload, I don't think we can reasonably talk about "upload a
source package" any more.  What would you define that to actually mean?
Maybe I'm missing some introductory documentation here.

> ### Git object collisions
>
> The current Git repository format and wire protocols use SHA-1 hash
> digests (and only SHA-1 hash digests) to identify objects in the Git
> repository. Git uses a SHA-1 hash function that has been
> [hardened against the SHAttered attack on 
> SHA-1](https://github.com/cr-marcstevens/sha1collisiondetection),
> and therefore is probably not vulnerable to known collision attacks.

Can this be substantiated?  Using SHA1CD in Git does not necessarily
mean someone cannot manually create a Git repository with a colliding
git commit somewhere in the history that gets accepted by git, and
allows someone to replace actual file contents.  That may be the case,
but I haven't seen any detailed analysis answering that.
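To make the concern concrete, here is a minimal sketch of how Git derives an
object ID: it hashes a short header plus the object's bytes.  Two different
payloads that collide under the hash would therefore share one ID.  The
function name is mine, and this uses plain hashlib; it does not model the
collision detection that SHA1CD adds.

```python
import hashlib


def git_object_id(obj_type: str, payload: bytes, algo: str = "sha1") -> str:
    """Return the hex object ID Git would assign to an object.

    Git hashes "<type> <size>\\0" followed by the raw payload.  The
    function name is illustrative, not part of any Git API.
    """
    header = f"{obj_type} {len(payload)}".encode() + b"\x00"
    return hashlib.new(algo, header + payload).hexdigest()


blob = b"hello world\n"
print(git_object_id("blob", blob))            # 40 hex digits (SHA-1)
print(git_object_id("blob", blob, "sha256"))  # 64 hex digits (SHA-256)
```

Anything that authenticates only the ID (such as a commit ID named in a signed
tag) is only as strong as the hash that produced it.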

> This analysis is relevant only for SHA-1-based Git repositories. Once
> Salsa supports SHA-256 Git repositories, tag2upload could decline to act
> on any repository that uses SHA-1 hash digests, making this entire section
> moot.

I don't think it will be as simple as that: the git SHA256 transition
document suggests to me that even signed tags may refer to both SHA256
and SHA1 commits:

https://git-scm.com/docs/hash-function-transition#_signed_tags

Thus tag2upload would need to require 1) SHA256 Git repository support,
AND 2) that git tags refer to a SHA256 commit id, AND 3) any git
submodules used also rely on SHA256 rather than SHA1.
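As a rough sketch of those three conditions (not tag2upload code; all names
here are hypothetical), a policy check could classify each object ID by its
hex length and refuse anything that is not SHA256.  I assume the repository
format string is obtained separately, for example from
`git rev-parse --show-object-format`.

```python
import re

SHA1_HEX = re.compile(r"[0-9a-f]{40}")
SHA256_HEX = re.compile(r"[0-9a-f]{64}")


def hash_kind(object_id: str) -> str:
    """Classify a full hex object ID by length."""
    if SHA256_HEX.fullmatch(object_id):
        return "sha256"
    if SHA1_HEX.fullmatch(object_id):
        return "sha1"
    return "unknown"


def sha256_only(repo_format: str, tagged_commit: str,
                submodule_commits: list[str]) -> bool:
    """Hypothetical policy: accept only if all three conditions hold."""
    return (
        repo_format == "sha256"                       # 1) repository format
        and hash_kind(tagged_commit) == "sha256"      # 2) commit named by the tag
        and all(hash_kind(c) == "sha256"              # 3) every submodule pin
                for c in submodule_commits)
    )


print(sha256_only("sha256", "ab" * 32, ["cd" * 32]))  # all SHA256
print(sha256_only("sha256", "ab" * 32, ["cd" * 20]))  # SHA-1 submodule pin
```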

> #### Replacing the upstream tree
>
> The attack: Construct a benign and malicious Git tree pair containing only
> the upstream source. Reference the benign tree in a source package and get
> that source package signed by a sponsor to trigger tag2upload processing.
> Race the tag2upload server by deleting the upstream tag and commit ID and
> then pushing the malicious Git repository as a new commit with the same
> commit ID.

I think this is an important and realistic attack vector that we
shouldn't be vulnerable to.

> The upstream tag name is present in the signed tag metadata, but since
> that tag itself is not required to be signed, the attacker can move it at
> will. The upstream tag therefore provides no protection against this
> attack apart from a small detection risk. Authentication of the upstream
> tree comes only from the inclusion of its commit ID in the tag metadata.

Which is SHA1 currently, and thus vulnerable to collision attacks,
which are known to be possible.

> I suspect (but am not certain) that this attack would normally be
> prevented by the Salsa Git service. The benign tree already existed in the
> same repository with the referenced commit ID (presumed to be checked by
> the sponsor during review), and even if references to that object are
> deleted via branch 

Re: [RFC] General Resolution to deploy tag2upload

2024-06-13 Thread Luca Boccassi
On Thu, 13 Jun 2024 at 14:49, Ian Jackson
 wrote:
>
> Timo Röhling writes ("Re: [RFC] General Resolution to deploy tag2upload"):
> > Luca Boccassi  [2024-06-13 14:23]:
> > >As far as I understand in the current proposal the trigger is a
> > >webhook running on Salsa after a push - have you considered instead
> > >having the trigger be a stage in the salsa-ci pipeline, that would run
> > >after the previous stages have completed successfully?
> >
> > I hate that idea. From past experience, the Salsa CI pipeline is
> > slower and much more flaky than the buildds, so I'm not going to
> > spend several hours (and retries) per upload waiting to see if the
> > Salsa CI deemed my upload worthy.
>
> I hope Luca wasn't suggesting that Salsa CI as a blocker ought to be
> mandatory.  Like so many things in this space, some people love what
> others hate.

I was not, I wasn't suggesting to make this a hard requirement, as you
say that's more complicated. Merely moving the fire-and-forget webhook
as the last stage of the pipeline, as the default
setting/setup/config/whatever. This is not to provide strong
guarantees, but merely an easy default that encourages a QA pass
first. Then maintainers can override the pipeline config and skip it,
if they don't want it for any reason. If it was the default, I suspect
de-facto the majority of uploads would go through it, and we would
gain in quality, on average (exceptions apply, etc etc).



Re: [RFC] General Resolution to deploy tag2upload

2024-06-13 Thread Ian Jackson
Timo Röhling writes ("Re: [RFC] General Resolution to deploy tag2upload"):
> Luca Boccassi  [2024-06-13 14:23]:
> >As far as I understand in the current proposal the trigger is a
> >webhook running on Salsa after a push - have you considered instead
> >having the trigger be a stage in the salsa-ci pipeline, that would run
> >after the previous stages have completed successfully?
>
> I hate that idea. From past experience, the Salsa CI pipeline is 
> slower and much more flaky than the buildds, so I'm not going to 
> spend several hours (and retries) per upload waiting to see if the 
> Salsa CI deemed my upload worthy.

I hope Luca wasn't suggesting that Salsa CI as a blocker ought to be
mandatory.  Like so many things in this space, some people love what
others hate.

Ian.

-- 
Ian Jackson   These opinions are my own.

Pronouns: they/he.  If I emailed you from @fyvzl.net or @evade.org.uk,
that is a private address which bypasses my fierce spamfilter.



Re: [RFC] General Resolution to deploy tag2upload [and 1 more messages]

2024-06-13 Thread Simon Richter

Hi,

On 6/13/24 20:29, Marco d'Itri wrote:


>> Do we actually want or need to hoard all the collaboration history?

> Of course: this makes auditing much easier.


That is a *massive* amount of data though, especially if we're expected 
to import the entire upstream git history as well and base the packaging 
branch on top of an upstream commit.


We will also need to be prepared for removal requests, so there needs to 
be a procedure in place for that, people authorized to perform it, and 
an audit framework for that.


I don't think any additional auditing of upstream sources will be 
performed because of this either, they will just be pulled in and used 
as-is. We might get additional insights after a breach, perhaps, if 
Github decide to take a compromised repository offline and our copy is 
still accessible.


We could add some mechanisms, like enforcing that merge commits pulling 
in a new upstream version will only modify files outside of debian/ in 
one subtree, and files inside debian/ in the other, but that conflicts 
with workflows that maintain Debian-specific patches as commits instead 
of patch files.


Without such a mechanism, these merge commits would immediately become 
the most obvious place to hide malicious code in a large changeset.
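A minimal sketch of the check described above, assuming the caller already has
the list of paths that the merge commit changes relative to each parent (for
instance from `git diff --name-only`).  The function names and the choice of
which parent is which are my assumptions, not an existing tool.

```python
def in_debian(path: str) -> bool:
    """True for paths inside the debian/ packaging directory."""
    return path == "debian" or path.startswith("debian/")


def merge_is_separated(changed_vs_packaging: list[str],
                       changed_vs_upstream: list[str]) -> bool:
    """Check a merge that pulls in a new upstream version.

    changed_vs_packaging: paths that differ from the packaging parent,
        i.e. what the new upstream brought in; must be outside debian/.
    changed_vs_upstream: paths that differ from the upstream parent,
        i.e. what Debian adds; must be inside debian/.
    """
    return (all(not in_debian(p) for p in changed_vs_packaging)
            and all(in_debian(p) for p in changed_vs_upstream))


# Patches kept as files under debian/patches: passes.
print(merge_is_separated(["src/a.c"], ["debian/changelog",
                                       "debian/patches/fix.patch"]))
# Debian-specific change committed directly to src/: rejected.
print(merge_is_separated(["src/a.c"], ["debian/changelog", "src/a.c"]))
```

The second call shows the conflict mentioned above: a workflow that keeps
Debian changes as commits on upstream files cannot pass this check.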


We have several 90% solutions of mapping Debian packaging onto git, but 
all of these are incomplete and annoying to use because we disagree with 
git on what constitutes data, and what constitutes metadata, so the data 
model does not match reality or requirements, and from a security 
standpoint that concerns me more than improved forensics.


   Simon



Re: [RFC] General Resolution to deploy tag2upload

2024-06-13 Thread Ian Jackson
Luca Boccassi writes ("Re: [RFC] General Resolution to deploy tag2upload"):
> As far as I understand in the current proposal the trigger is a
> webhook running on Salsa after a push - have you considered instead
> having the trigger be a stage in the salsa-ci pipeline, that would run
> after the previous stages have completed successfully? IE, like we can
> do today with aptly or pages publishing, for example. What runs in the
> pipeline is still under the control of the individual repo
> maintainers, but the default would mean having this additional CI
> step, which I think is what Andreas is hinting at, but solve it on the
> other end of the pipeline - at the beginning, rather than at the end.

I think this would be possible in principle.  It would certainly be nice to
be able to say "please upload this but only if the Salsa CI passes".

It is more complicated, though, than simply having the webhook run off
CI jobs instead.  A webhook is supposed just to be a trigger to look
at something, not a definitive API call; conversely, the user ought
not to make a signed tag requesting an unconditional upload if what
they really mean is "upload if CI passes".

So to do this properly the t2u server should somehow separately verify
that the CI has passed.  I think this probably means having a CI job
which signs a "tests passed on this commit" tag using a key
available to Salsa, and providing the t2u server with *both*
signatures.  (Since we don't want the t2u server making API calls to
salsa!)
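A toy sketch of the "both signatures" idea, purely to show the shape of the
check.  HMAC stands in for OpenPGP signatures here, and the message strings,
key handling and function names are all invented for illustration; none of
this is implemented in tag2upload.

```python
import hashlib
import hmac


def sign(key: bytes, message: str) -> str:
    """Stand-in for a real signature: an HMAC-SHA256 over the message."""
    return hmac.new(key, message.encode(), hashlib.sha256).hexdigest()


def verify(key: bytes, message: str, signature: str) -> bool:
    return hmac.compare_digest(sign(key, message), signature)


def may_upload(commit: str, upload_sig: str, ci_sig: str,
               maintainer_key: bytes, ci_key: bytes) -> bool:
    """Accept only if both attestations cover the same commit.

    The server checks the two signatures it was handed and makes no
    calls back to the forge.
    """
    upload_ok = verify(maintainer_key, f"upload-if-ci-passes {commit}", upload_sig)
    ci_ok = verify(ci_key, f"tests-passed {commit}", ci_sig)
    return upload_ok and ci_ok


maintainer_key, ci_key = b"maintainer-demo-key", b"ci-demo-key"
commit = "a" * 40
upload_sig = sign(maintainer_key, f"upload-if-ci-passes {commit}")
ci_sig = sign(ci_key, f"tests-passed {commit}")
print(may_upload(commit, upload_sig, ci_sig, maintainer_key, ci_key))
```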

I don't propose to implement this right away.

Ian.

-- 
Ian Jackson   These opinions are my own.

Pronouns: they/he.  If I emailed you from @fyvzl.net or @evade.org.uk,
that is a private address which bypasses my fierce spamfilter.



Re: [RFC] General Resolution to deploy tag2upload

2024-06-13 Thread Timo Röhling

Hi,

* Luca Boccassi  [2024-06-13 14:23]:

> As far as I understand in the current proposal the trigger is a
> webhook running on Salsa after a push - have you considered instead
> having the trigger be a stage in the salsa-ci pipeline, that would run
> after the previous stages have completed successfully?

I hate that idea. From past experience, the Salsa CI pipeline is
slower and much more flaky than the buildds, so I'm not going to
spend several hours (and retries) per upload waiting to see if the
Salsa CI deemed my upload worthy.



Cheers
Timo

--
⢀⣴⠾⠻⢶⣦⠀   ╭╮
⣾⠁⢠⠒⠀⣿⡁   │ Timo Röhling   │
⢿⡄⠘⠷⠚⠋⠀   │ 9B03 EBB9 8300 DF97 C2B1  23BF CC8C 6BDD 1403 F4CA │
⠈⠳⣄   ╰╯




Re: [RFC] General Resolution to deploy tag2upload

2024-06-13 Thread Luca Boccassi
On Thu, 13 Jun 2024 at 12:47, Ian Jackson
 wrote:
>
> Andreas Tille writes ("Re: [RFC] General Resolution to deploy tag2upload"):
> > That means some package build process is done before the source
> > package is forwarded to dak and sends some e-mail back?
>
> Only a source package build.

As far as I understand in the current proposal the trigger is a
webhook running on Salsa after a push - have you considered instead
having the trigger be a stage in the salsa-ci pipeline, that would run
after the previous stages have completed successfully? IE, like we can
do today with aptly or pages publishing, for example. What runs in the
pipeline is still under the control of the individual repo
maintainers, but the default would mean having this additional CI
step, which I think is what Andreas is hinting at, but solve it on the
other end of the pipeline - at the beginning, rather than at the end.



Re: [RFC] General Resolution to deploy tag2upload

2024-06-13 Thread Pierre-Elliott Bécue
Sean Whitton  wrote on 13/06/2024 at 14:44:57+0200:

> Hello,
>
> On Thu 13 Jun 2024 at 01:05pm +02, Ansgar  wrote:
>
>> The statement also reads like the implementation was reviewed by Russ
>> which as far as I understand isn't the case either? Or do you only plan
>> to deploy a version once such a review happened?
>
> We weren't planning for this to be done, no.

I'm sorry but I have a problem here.

You stated in your first mail that both rra and noodles audited your
work, and here it seems that audited is potentially a bit more than what
has been done.

Could you elaborate explicitly on what you mean with "audited"?

Bests,
-- 
PEB




Re: [RFC] General Resolution to deploy tag2upload

2024-06-13 Thread Sean Whitton
Hello,

On Thu 13 Jun 2024 at 01:05pm +02, Ansgar  wrote:

> The statement also reads like the implementation was reviewed by Russ
> which as far as I understand isn't the case either? Or do you only plan
> to deploy a version once such a review happened?

We weren't planning for this to be done, no.

-- 
Sean Whitton



Re: [RFC] General Resolution to deploy tag2upload

2024-06-13 Thread Aigars Mahinovs
On Thu, 13 Jun 2024 at 12:02, Ian Jackson
 wrote:
>
> Sean Whitton writes ("Re: [RFC] General Resolution to deploy tag2upload"):
> >[Joerg Jaspert wrote:]
> >> Actually, we can set acls on fingerprints and then that key won't be able
> >> to upload anymore. That is not something recorded in the keyrings or the
> >> DM list. Obviously that is not something used often (really really
> >> seldom), it is more for "this key is compromised badly, please turn off
> >> anything with it *NOW*" situations, which is what Helmut meant with the
> >> urgent cases.
> [and]
> >> *Really* seldom. I would have to dig and see when, especially for the
> >> timing thing with keyring team.
> >
> > Thanks.  Then possibly it is sufficient for ftpmaster just to disable
> > tag2upload's whole key until the keyring update is pushed.
>
> I'm not sure this is a sufficient answer.  We don't want uploads by
> revoked keys to appear on *.dgit.d.o either.

Correct me if I am wrong, but if we are looking at dgit.d.o as
snapshot and audit log of the tag2upload service, would it not be
beneficial for the auditing and back-tracing process to actually
keep the code that someone tried to upload via tag2upload even if
their key is revoked, expired or signature is invalid?

Maybe re-tagged to something like invalid_$original_tag but still kept
around for people to inspect if needed.

-- 
Best regards,
Aigars Mahinovs        mailto:aigar...@debian.org
  #--#
 | .''`.Debian GNU/Linux (http://www.debian.org)|
 | : :' :   Latvian Open Source Assoc. (http://www.laka.lv) |
 | `. `'Linux Administration and Free Software Consulting   |
 |   `- (http://www.aiteki.com) |
 #--#



Re: [RFC] General Resolution to deploy tag2upload [and 1 more messages]

2024-06-13 Thread Simon Richter

Hi Ian,

On 6/13/24 18:57, Ian Jackson wrote:


> (Because git inherently has history, the dgit-repos server can
> perform both functions at once.)


Do we actually want or need to hoard all the collaboration history?

   Simon



Re: source tarballs vs. source from git

2024-06-13 Thread Sean Whitton
Hello,

On Wed 12 Jun 2024 at 03:45pm +01, Simon McVittie wrote:

> As far as I know, git-archive (and therefore git-deborig) doesn't guarantee
> that repeatedly archiving the same git tree produces the same tarball,
> which could be awkward for the ftp archive's tarball-integrity-based rules;
> but hopefully tag2upload would insulate individual developers from that by
> always "doing the right thing" for the current contents of the archive?

Just to note quickly that yes, it does.
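For readers wondering why byte-identical tarballs are not automatic: a tar
archive records metadata such as timestamps alongside file contents, so the
same tree can serialise to different bytes.  The sketch below uses Python's
tarfile module rather than git-archive, and the helper names are mine; it only
illustrates the general problem, not what tag2upload does.

```python
import hashlib
import io
import tarfile


def make_tar(files: dict[str, bytes], mtime: int) -> bytes:
    """Serialise files into an uncompressed tar with fixed order and mtime."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name in sorted(files):          # fixed member order
            info = tarfile.TarInfo(name)    # uid/gid/mode take their defaults
            info.size = len(files[name])
            info.mtime = mtime              # the only metadata we vary
            tar.addfile(info, io.BytesIO(files[name]))
    return buf.getvalue()


def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


tree = {"README": b"hi\n", "src/main.c": b"int main(void){return 0;}\n"}
first = digest(make_tar(tree, mtime=0))
second = digest(make_tar(tree, mtime=0))
shifted = digest(make_tar(tree, mtime=1))
print(first == second)   # same contents and metadata
print(first == shifted)  # same contents, different timestamp
```

Compression adds further variation on top of this, which is why a checksum
rule on the .orig tarball needs one canonical copy to compare against.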

-- 
Sean Whitton



Re: [RFC] General Resolution to deploy tag2upload

2024-06-13 Thread Sean Whitton
Hello,

On Thu 13 Jun 2024 at 01:05pm +02, Ansgar  wrote:

> On Thu, 2024-06-13 at 16:58 +0800, Sean Whitton wrote:
>> On Wed 12 Jun 2024 at 11:14am +02, Ansgar  wrote:
>> >
>> > As far as I understand, the GR is about pushing the design and
>> > implementation as is, without any changes. It very explicitly says
>> > so.
>>
>> It does not say this.
>
> Quote:
>
> ---
> 1. tag2upload, in the form designed and implemented by Sean Whitton and
>Ian Jackson, and reviewed by Jonathan McDowell and Russ Allbery,
>should be deployed to official Debian infrastructure.
> ---

It also says "The design is not expected to change significantly, but we
may tweak details in response to feedback from d-vote, and while
finishing the server-side deployment implementation, in consultation
with DSA."

The idea of the GR text is to give the named designers a certain amount
of latitude to make appropriate changes.

Thanks for the feedback here.

-- 
Sean Whitton



Re: [RFC] General Resolution to deploy tag2upload [and 1 more messages]

2024-06-13 Thread Ian Jackson
Jonathan Carter writes ("Re: [RFC] General Resolution to deploy tag2upload"):
> On 2024/06/12 10:21, Luca Boccassi wrote:
> > Having a separate namespace with strong ACLs seems exactly what you
> > want, even if it duplicates the individual repositories (the backend
> > git store deduplicates it anyway, so in practice it should be quite
> > cheap). Having an entire separate git forge that competes with Salsa
> > seems orthogonal to this, and counterproductive for the project.
> 
> > I found the overview of tag2upload from Ian at MDC Cambridge quite
> useful (and the workflow diagrams that he presented). From my 
> understanding (and I may still have the wrong end of a stick here), the 
> additional git store used for tag2upload becomes a replacement for 
> source packages that happens to use git. So from my understanding, it's 
> more a competitor to source packages rather than to salsa.

Yes.  Russ's comparisons to archive.d.o and snapshot.d.o are helpful.
(Because git inherently has history, the dgit-repos server can
perform both functions at once.)

Scott Kitterman writes ("Re: [RFC] General Resolution to deploy tag2upload"):
> I think it is more accurate to say that they are mirrors.  They both contain 
> details of current and historical packages.  The difference is that snapshot 
> is downstream of the archive, while these putative tag2upload
> repositories are upstream.
>
> It is being upstream of the primary archive that makes it far more security
> sensitive.

The dgit-repos server (*.dgit.d.o) is not upstream of
archive.debian.org.

The tag2upload conversion server is upstream of both, and is indeed
very security sensitive.

Ian.

-- 
Ian Jackson   These opinions are my own.

Pronouns: they/he.  If I emailed you from @fyvzl.net or @evade.org.uk,
that is a private address which bypasses my fierce spamfilter.



Re: [RFC] General Resolution to deploy tag2upload

2024-06-13 Thread Ian Jackson
Andreas Tille writes ("Re: [RFC] General Resolution to deploy tag2upload"):
> That means some package build process is done before the source
> package is forwarded to dak and sends some e-mail back?

Only a source package build.

> I know we have this.  My point is that tag2upload users might forget
> to use it before using tag2upload service.  I simply want to make
> sure that tag2upload is not another way to upload anything that does
> not build on buildservices.

I'm afraid that tag2upload is precisely another way to do that.

That's because that's how uploading works now, and tag2upload is
another way to make an upload.  Uploads must be source-only nowadays
(in most cases).  So there is, by design, nothing in the existing
setup that ensures that a maintainer built binaries.  (I get the
feeling that you're not happy with this situation, but that's how
Debian is now, and I think it's a jolly good thing.)

You might argue that tag2upload makes this worse because it makes it
easier to perform uploads.  It certainly *does* make it easier to
perform uploads.  That's a big part of the point.

I think this can only be a *downside* if you think it is a good
thing that uploading is difficult.

Ian.

-- 
Ian Jackson   These opinions are my own.

Pronouns: they/he.  If I emailed you from @fyvzl.net or @evade.org.uk,
that is a private address which bypasses my fierce spamfilter.



Re: [RFC] General Resolution to deploy tag2upload

2024-06-13 Thread Ian Jackson
Antoine Beaupré writes ("Re: [RFC] General Resolution to deploy tag2upload"):
> Right now my workflow is basically git-buildpackage + salsa + dput,
> reluctantly using pristine-tar sometimes.

> I have *tried* to use dgit, but [...]

I think maybe I should make a blog post explaining what dgit is, and
isn't.  But that's probably rather out-of-scope for this thread.

> 1. how does this change my gbp/salsa/dput workflow?
>can i *just* s/dput/dgit/?

By "this" I'm going to take you to mean "tag2upload".
With tag2upload you don't run dgit.

You replace gbp/salsa/dput with git-debpush.  git-debpush will push
your branch to salsa for you, as well as making and pushing the git tag.
The tag2upload service will take care of the rest.

You'll still want to run gbp, etc., as part of your pre-upload
testing, of course.

> Can I just keep doing gbp + salsa and switch the "dput" bit to
> "dgit" or "tag2upload" without changing anything else? That would be
> kind of neat, but I'm not sure *why* I would do that in the first
> place...

There are two good reasons why you might want to adopt tag2upload,
(and they mostly apply to dgit push).

Firstly, tag2upload is much simpler, more convenient and faster.
(dgit is also simpler to use and more convenient, but not quite so
simple as tag2upload, and it isn't faster than old-school uploads.)

Secondly, tag2upload is more reliable, more traceable, and provides a
better result for users.  (These advantages apply to dgit too.)
NB "reliable" here often means "more likely to stop and report a
problem, than blindly carry on and perhaps do a wrong thing".

tag2upload and dgit have many additional safety checks that help avoid
mistakes.  For example, you can be sure that the git tree you are
about to upload is precisely what ends up in the archive - so you can
rely on git diff and never need to run debdiff on source packages.
It is much harder to accidentally undo an NMU.  etc.

> 2. does this scale to the archive?
> ==
...
> So what's the plan for dealing with the sheer size of the Debian
> archive, assuming that eventually everything might reasonably be
> expected to be *both* on dgit and salsa, if I understand the proposal
> correctly?

It's true that this is a lot of data.  It's going to be comparable in
size to the archive.  Scalability is a reasonable concern.

There is one singleton service push.dgit.d.o, which is used only by
uploaders (and the tag2upload robot).  So it shouldn't become
overloaded.

Non-uploading clients use {browse,git}.dgit.d.o.  Currently that is a
single host, which is also shared with some other services.  But it is
a read-only mirror and we could scale up to multiple mirrors.

> (Well, technically, the proposal says "this is opt-in, entirely
> optional", but Ian at least has explicitly stated he expects people to
> enthusiastically start to use dgit massively in the future, so even if
> that's not actually part of the proposal, we should take that scenario
> into account.)

YM enthusiastically start to use tag2upload.  But, yes.

> 3. what does this mean for salsa/jenkins/bts/etc?

Nothing.

> In the long term, what do you actually think we should do about the
> duplication of tools out there? We are wasting a lot of energy here
> maintaining two CI systems (Jenkins and GitLab CI), two bug trackers
> (BTS and GitLab issues), two wiki systems (MoinMoin and GitLab Wikis),

I don't think I have an opinion about that.  (Or at least, maybe I do,
but it's not relevant.)

tag2upload is not a competitor to any of the things you list.

In the long term, tag2upload depends on there being one or more things
that are enough like git forges that they can call webhooks and
serve up git tags.  Right now that's Salsa.  If Debian wants to
replace gitlab with some other forge that's not something that
tag2upload has much of an opinion about.

Ultimately, *.dgit.d.o is in some sense a competitor to
archive.debian.org, but I don't see us abolishing archive.d.o.
Instead, tag2upload is getting us further towards dual running,
where we accept either source packages or git trees, and publish both.

> two (or more?) VCS hosting systems (dgit and GitLab repos)?

dgit-repos is complementary to gitlab.  (The relationship to salsa has
been discussed extensively elsewhere in this megathread.)

> I understand the proposal doesn't directly say "oh yeah, we're actually
> thinking we should ditch salsa and replace it with all those nice little
> small components", but it is certainly taking a stand that Salsa is not
> good enough to provide the level of security that is required to upload
> packages in Debian, and saying that is saying a lot because I suspect we
> are *actually* trusting Salsa and GitLab with our code much more than we
> would like to admit...

To be completely clear: tag2upload is not a replacement for Salsa, and
it cannot be such a replacement.  In the planned deployment it
*depends on* Salsa.

> Anyways, I hope I'm not 

Re: [RFC] General Resolution to deploy tag2upload [and 1 more messages]

2024-06-13 Thread Marco d'Itri
s...@debian.org wrote:

>Do we actually want or need to hoard all the collaboration history?
Of course: this makes auditing much easier.

-- 
ciao,
Marco



Re: [RFC] General Resolution to deploy tag2upload

2024-06-13 Thread Ansgar 
On Thu, 2024-06-13 at 16:58 +0800, Sean Whitton wrote:
> On Wed 12 Jun 2024 at 11:14am +02, Ansgar  wrote:
> > 
> > As far as I understand, the GR is about pushing the design and
> > implementation as is, without any changes. It very explicitly says
> > so.
> 
> It does not say this.

Quote:

---
1. tag2upload, in the form designed and implemented by Sean Whitton and
   Ian Jackson, and reviewed by Jonathan McDowell and Russ Allbery,
   should be deployed to official Debian infrastructure.
---

I understand that as the form as designed and implemented (now), not
only a future version with possible changes.

If you only want to deploy a modified version, then the text should
probably be amended.

The statement also reads like the implementation was reviewed by Russ
which as far as I understand isn't the case either? Or do you only plan
to deploy a version once such a review happened?

Ansgar



Re: [RFC] General Resolution to deploy tag2upload

2024-06-13 Thread Ian Jackson
Aigars Mahinovs writes ("Re: [RFC] General Resolution to deploy tag2upload"):
> Correct me if I am wrong, but if we are looking at dgit.d.o as
> snapshot and audit log of the tag2upload service, would it not be
> beneficial for the auditing and back-tracing process to actually
> keep the code that someone tried to upload via tag2upload even if
> their key is revoked, expired or signature is invalid?

This is an interesting idea.  There are some potential trouble
vectors, there, though (eg, malicious people could cause the system to
store Bad Stuff).  And also it's not entirely straightforward to
implement because the system would need suitable access to whatever
the bad-uploads archive server is.

With the current design, it might be possible to retrieve the tag from
Salsa, perhaps with admin assistance, at least for a while.  That's
not great, but I'm hoping the need for this will be quite rare.

I think I would like to treat your suggestion as a possibility for
future enhancement, rather than something we'd have from day one.

Ian.

-- 
Ian Jackson   These opinions are my own.

Pronouns: they/he.  If I emailed you from @fyvzl.net or @evade.org.uk,
that is a private address which bypasses my fierce spamfilter.



Re: [RFC] General Resolution to deploy tag2upload [and 1 more messages]

2024-06-13 Thread Ian Jackson
Simon Richter writes ("Re: [RFC] General Resolution to deploy tag2upload [and 1 
more messages]"):
> On 6/13/24 18:57, Ian Jackson wrote:
> > (Because git inherently has history, the dgit-repos server can
> > perform both functions at once.)
> 
> Do we actually want or need to hoard all the collaboration history?

In short, yes, we certainly want to and we may need to.

Nowadays for software maintained in git, the git *history* is usually
an important part of the source code - it's part of the "preferred
form for modification" as the GPL has it.

Ian.

-- 
Ian Jackson   These opinions are my own.

Pronouns: they/he.  If I emailed you from @fyvzl.net or @evade.org.uk,
that is a private address which bypasses my fierce spamfilter.



Re: [RFC] General Resolution to deploy tag2upload

2024-06-13 Thread Ian Jackson
Timo Röhling writes ("Re: [RFC] General Resolution to deploy tag2upload"):
> Considering that tag2upload is supposed to become a critical 
> component of our infrastructure, I am missing (or may have 
> overlooked) some information on how the deployment is going to be 
> maintained.
> 
> I assume that you will continue to work on the code itself, but who 
> is going to be responsible for keeping the tag2upload service 
> operational? Are you going to manage the deployment as well, has DSA 
> agreed to do it, or do you have an altogether different arrangement 
> in mind?
> 
> Again, thank you for your work on this!

I'm expecting that Sean and I will become the service owners, and that
DSA will manage the VMs.

Currently the *.dgit.d.o git servers (two hosts) are managed that way.
I've had a good working relationship with DSA there.  The t2u
conversion service is rather more complex but I don't foresee
difficulties.  Note, though, that I haven't actually had any specific
conversations with DSA about these kinds of details, at this stage.

Obviously, as with everything in Debian, more help would be very
welcome.  If anyone would like to join in please just let us know!

Ian.

-- 
Ian Jackson   These opinions are my own.

Pronouns: they/he.  If I emailed you from @fyvzl.net or @evade.org.uk,
that is a private address which bypasses my fierce spamfilter.



Re: [RFC] General Resolution to deploy tag2upload [and 2 more messages]

2024-06-13 Thread Ian Jackson
Scott Kitterman writes ("Re: [RFC] General Resolution to deploy tag2upload"):
> If I am understanding you correctly, tag2upload is only relevant to the XZ 
> Utils type attack if the maintainer uses the upstream git rather than the 
> upstream provided tarball as the basis for their Debian work.  Is that right?

Yes.  It is precisely using the upstream git, rather than the upstream
tarball, that eliminates the gap through which the exploit activation
was smuggled in this case.

(Whether the maintainer could use the upstream tarball as the
.orig.tar.gz, while using upstream git as the basis for the package
contents, is a complicated question.)

> If so, it seems to me that is entirely tangential to this proposed GR.

No, because it is not sufficient to base the maintainer git repository
on the upstream git.  It is also necessary that something checks that
those files in the .orig.tar.gz which aren't patched in
debian/patches/ correspond precisely to the git tree the maintainer is
working with.

This check is done by `dgit push-source`, and by tag2upload.
But it is often not done by other workflows.  (Because there are so
many workflows, it is difficult to make fully general statements.)
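To illustrate the kind of check meant here (this is a sketch of the idea, not
how dgit or tag2upload implement it), suppose both the unpacked .orig tarball
and the git tree are available as path-to-content mappings.  The names and the
`patched` parameter are hypothetical.

```python
import hashlib


def is_packaging(path: str) -> bool:
    """True for paths inside debian/, which the .orig tarball does not cover."""
    return path == "debian" or path.startswith("debian/")


def upstream_mismatches(orig_files: dict[str, bytes],
                        git_files: dict[str, bytes],
                        patched: frozenset = frozenset()) -> list[str]:
    """List upstream paths where the tarball and the git tree disagree.

    Paths under debian/ and paths named in `patched` (files modified via
    debian/patches/) are excluded from the comparison.
    """
    def digests(files: dict[str, bytes]) -> dict[str, str]:
        return {p: hashlib.sha256(c).hexdigest()
                for p, c in files.items()
                if not is_packaging(p) and p not in patched}

    orig, git = digests(orig_files), digests(git_files)
    return sorted(p for p in orig.keys() | git.keys()
                  if orig.get(p) != git.get(p))


git_tree = {"configure.ac": b"AC_INIT\n", "src/lib.c": b"int f(void);\n",
            "debian/control": b"Source: demo\n"}
orig_tarball = {"configure.ac": b"AC_INIT\n", "src/lib.c": b"int f(void);\n",
                "m4/extra.m4": b"# present only in the tarball\n"}
print(upstream_mismatches(orig_tarball, git_tree))
```

A file that exists only in the release tarball, as in the example data, is
exactly the sort of discrepancy such a check would surface before upload.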

Simon McVittie writes ("Re: [RFC] General Resolution to deploy tag2upload"):
> Is your position here that if your upstream releases source tarballs
> that intentionally differ from what's in git (notably this is true
> for Autotools `make dist`), then any Good™ maintainer must generate
> their own .orig.tar.* from upstream git and use those in the upload,
> disregarding upstream's source tarball entirely?

I don't think I would state that as a general rule.  I would rather
say that if upstream provides signed git tags, it is *usually* better
to only use upstream git, and ignore the upstream source tarball.

> I'd prefer not to be in a situation where whatever a maintainer does,
> some segment of the project will consider them to be failing to meet
> the project's basic expectations - that seems like a recipe for burnout.

Debian is a very large and old project now and we have a very wide
range of opinions and workflows.  And our mechanisms for resolving
disagreements don't work well.  I think that the problem you lament
already exists.

One non-goal of the tag2upload project is to try to reconcile all
this.  Rather, we want to provide a new option, that is convenient,
makes it easy to follow good practices, and will be welcomed and can
be adopted widely.

> In the projects where I'm an upstream maintainer, I *am* trying to move
> towards the official source release being equivalent to a `git archive`
> (including replacing Autotools with Meson, replacing submodules with
> subtrees, etc.), but I don't have the resources or social capital to do
> that instantaneously, even in the few projects where I have influence.

Yes.

Simon McVittie writes ("Re: source tarballs vs. source from git (was: 
tag2upload)"):
> I think the claim here might be that Debian should stop dealing with
> upstream source tarball releases, and instead have the packaging be
> branched from upstream git?

That is how I usually prefer to work, personally.

My claim here, specifically, is that working this way would have made
the xz attack harder.

> As a concrete example, for bubblewrap_0.9.0 (a convenient example
> of a relatively small package), that would mean that instead
> of having our packaged version of bubblewrap be based on the
> bubblewrap-0.9.0.tar.xz with sha256 c6347eac... which can be downloaded
> from https://github.com/containers/bubblewrap/releases/tag/v0.9.0, our
> packaged version of bubblewrap would be based on the tree that forms part
> of the tagged commit 8e51677a... in upstream git.

Yes, precisely.

> If we did that for xz-utils, then the xz-utils attacker would have
> had to include the glue code to activate their malicious payload in
> the upstream git history, and not just the official tarball release -
> which would hopefully have made it more likely that it would have been
> discovered before we integrated the malicious version.

Exactly.

> I think that's going to be a harder sell for some packages than for
> others.

Yes.  That's why we're not saying this way of working is (or should
be) mandatory.

> As far as I know, git-archive (and therefore git-deborig) doesn't guarantee
> that repeatedly archiving the same git tree produces the same tarball,
> which could be awkward for the ftp archive's tarball-integrity-based rules;
> but hopefully tag2upload would insulate individual developers from that by
> always "doing the right thing" for the current contents of the archive?

Yes, it does.  Specifically, it won't make a new orig if the archive
already has one.
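
A minimal sketch of that rule, assuming a hypothetical lookup table of
origs already in the archive and a hypothetical callback that builds an
orig from git.  It is not the actual tag2upload code.

```python
def select_orig(upstream_version, archive_origs, generate_from_git):
    """Return (orig, newly_generated) for an upstream version.

    archive_origs maps upstream versions to the orig tarball already in
    the archive.  generate_from_git is only called when the archive has
    none, so an existing orig is never replaced by a regenerated one.
    """
    existing = archive_origs.get(upstream_version)
    if existing is not None:
        return existing, False
    return generate_from_git(upstream_version), True
```

Reusing the existing orig sidesteps the question of whether git-archive
output is byte-for-byte stable, because a second tarball for the same
upstream version is never created.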

Ian.

-- 
Ian Jackson   These opinions are my own.

Pronouns: they/he.  If I emailed you from @fyvzl.net or @evade.org.uk,
that is a private address which bypasses my fierce spamfilter.



Re: [RFC] General Resolution to deploy tag2upload

2024-06-13 Thread Ian Jackson
Bastian Blank writes ("Re: [RFC] General Resolution to deploy tag2upload"):
> On Wed, Jun 12, 2024 at 04:23:29PM +0100, Simon McVittie wrote:
> > On Wed, 12 Jun 2024 at 15:20:45 +0100, Ian Jackson wrote:
> > > tag2upload, like dgit, ensures and insists that the git tree you are
> > > uploading corresponds precisely [1] to the generated source package.
> > > 
> > > If you base your Debian git maintainer branch on the upstream git (as
> > > you should) and there is a discrepancy between the contents of the
> > > upstream git branch, and the .orig.tar.gz you're using, the upload
> > > will fail.
> 
> How would it fail?

git-debpush has code in it to try to check that your upload is likely
to succeed.  It should detect this situation, and report an error
message, before it even makes the tag for you.

You can override that check, since it's just there for your
convenience.  Or you could make and push the tag2upload
`please-upload` tag by hand.  If you do so, the discrepancy will be
detected by the tag2upload conversion system during source package
construction.  You'd receive an error report by email.

> This actually means we need to get rid of orig.tar completely.
> Something that does not exist can't differ.

.orig.tar.gz is still useful as a space optimisation for incremental
updates to source packages.

Ian.

-- 
Ian Jackson   These opinions are my own.

Pronouns: they/he.  If I emailed you from @fyvzl.net or @evade.org.uk,
that is a private address which bypasses my fierce spamfilter.



Re: [RFC] General Resolution to deploy tag2upload

2024-06-13 Thread Ian Jackson
Sean Whitton writes ("Re: [RFC] General Resolution to deploy tag2upload"):
>[Joerg Jaspert wrote:]
>> Actually, we can set acls on fingerprints and then that key won't be able
>> to upload anymore. That is not something recorded in the keyrings or the
>> DM list. Obviously that is not something used often (really really
>> seldom), it is more for "this key is compromised badly, please turn off
>> anything with it *NOW*" situations, which is what Helmut meant with the
>> urgent cases.
[and]
>> *Really* seldom. I would have to dig and see when, especially for the
>> timing thing with keyring team.
> 
> Thanks.  Then possibly it is sufficient for ftpmaster just to disable
> tag2upload's whole key until the keyring update is pushed.

I'm not sure this is a sufficient answer.  We don't want uploads by
revoked keys to appear on *.dgit.d.o either.

Joerg, is there some way that this fingerprint block information could
be made available in a more timely manner?  Ideally we would update
push.dgit.d.o to use this information, regardless of tag2upload.
(And the t2u conversion system should use it too.)
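
To make the idea concrete, here is a rough sketch of how a service
could consult such a block list before accepting a signed upload.  How
the list would actually be published is exactly the open question, so
the `blocked` argument and both function names are assumptions.

```python
def normalise_fingerprint(fingerprint):
    """Strip whitespace and upper-case an OpenPGP fingerprint."""
    return "".join(fingerprint.split()).upper()


def may_upload(signer_fingerprint, keyring, blocked):
    """Accept a signer only if the key is in the keyring and not blocked.

    keyring and blocked are iterables of fingerprints.  The block list
    takes precedence, so a key can be turned off before the next
    keyring update is pushed.
    """
    signer = normalise_fingerprint(signer_fingerprint)
    allowed = {normalise_fingerprint(f) for f in keyring}
    denied = {normalise_fingerprint(f) for f in blocked}
    return signer in allowed and signer not in denied
```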

I think maybe we should take this to a different venue, than this
thread on -vote.  How about a bug against ftp.d.o and/or
dgit-infrastructure ?

Thanks,
Ian.

-- 
Ian Jackson   These opinions are my own.

Pronouns: they/he.  If I emailed you from @fyvzl.net or @evade.org.uk,
that is a private address which bypasses my fierce spamfilter.



Re: [RFC] General Resolution to deploy tag2upload

2024-06-13 Thread Andreas Tille
Hi Sean,

Am Thu, Jun 13, 2024 at 04:53:56PM +0800 schrieb Sean Whitton:
> > thanks to all for this GR.  I like tag2upload in principle.  The only
> > thing I'm a bit scared about is that it simplifies uploading something
> > that was never built before on the local machine.  Sure, this can be
> > done with source-only uploads as well, but tag2upload makes it even
> > easier.
> 
> I don't believe it makes any difference.  We already have 'dgit
> push-source' which will do a source-only upload with a single command
> invocation.  And if 'dgit push-source' errors out, that's equivalent to
> tag2upload failing to upload and e-mailing you.

That means some package build process is done before the source
package is forwarded to dak and sends some e-mail back?
 
> No, there is nothing additional being done.
> 
> Now we have salsa CI, though, we have various good options for automated
> pre-upload testing.

I know we have this.  My point is that tag2upload users might forget
to use it before using tag2upload service.  I simply want to make
sure that tag2upload is not another way to upload anything that does
not build on buildservices.

Kind regards
   Andreas. 

-- 
https://fam-tille.de



Re: [RFC] General Resolution to deploy tag2upload

2024-06-13 Thread Timo Röhling

Hello Sean,

first of all, I have been a very happy user of dgit for some time 
now, and I want to use the opportunity to thank Ian and you for your 
excellent work. I consider dgit to be one of two major improvements 
to my packaging workflow (the other being sbuild with unshare 
backend), and I have no doubt that tag2upload will be just as 
reliable and useful.


* Sean Whitton  [2024-06-12 06:25]:

> =
> BEGIN FORMAL RESOLUTION TEXT
>
> tag2upload allows DDs and DMs to upload simply by using the
> git-debpush(1) script to push a signed git tag.
>
> 1. tag2upload, in the form designed and implemented by Sean Whitton and
>    Ian Jackson, and reviewed by Jonathan McDowell and Russ Allbery,
>    should be deployed to official Debian infrastructure.

Considering that tag2upload is supposed to become a critical 
component of our infrastructure, I am missing (or may have 
overlooked) some information on how the deployment is going to be 
maintained.


I assume that you will continue to work on the code itself, but who 
is going to be responsible for keeping the tag2upload service 
operational? Are you going to manage the deployment as well, has DSA 
agreed to do it, or do you have an altogether different arrangement 
in mind?


Again, thank you for your work on this!

Cheers
Timo

--
⢀⣴⠾⠻⢶⣦⠀   ╭╮
⣾⠁⢠⠒⠀⣿⡁   │ Timo Röhling   │
⢿⡄⠘⠷⠚⠋⠀   │ 9B03 EBB9 8300 DF97 C2B1  23BF CC8C 6BDD 1403 F4CA │
⠈⠳⣄   ╰╯


signature.asc
Description: PGP signature


Re: [RFC] General Resolution to deploy tag2upload

2024-06-13 Thread Aigars Mahinovs
On Wed, 12 Jun 2024 at 16:20, Jonas Smedegaard  wrote:
> To answer your convoluted question, I am suggesting that Salsa and
> tag2upload has very different needs (multi-user write versus multi-user
> append-only, drastically simplified), and consequently to not argue that
> reuse of Salsa for hosting tag2upload is a security benefit.

IMHO this is an interesting point that can be a real and useful
feature of the tag2upload system.

Think of it as a source version of snapshots.debian.org - if
tag2upload always saves the tagged state
of the repository to a separate append-only git server whenever it
processes a signed tag, that
would provide a clear archival backup of the exact state of software
that was processed for upload.

It does not matter where tag2upload gets the initial tags from - it
could be Salsa, it could be Github,
it could be a developer's self-hosted git server that is added to some
tag2upload config file for polling,
like Planet Debian works. tag2upload could pull from a bunch of git
sources. The config of those
repos does not matter anymore because tag2upload takes care of
signature verification and of archiving.

And where exactly tag2upload keeps its archive does not really matter,
as long as it is an append-only
git server (at least for the repos that tag2upload writes to, which
can be separate from actual development repos).

With that kind of setup, you could not only (like today) go to
snapshots.debian.org to get the exact binary of the
uploaded Debian package with its real state at any particular day in
the past, but also go to the archive git
server of tag2upload and for any processed tag check out the exact git
state that was processed, regardless
of anything that was later done to the original development
repo/server. Even if that server goes down, the archive
will remain.
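
As a rough sketch of what "append-only" could mean for such an archive
server: git's pre-receive hook is given the old object id, the new
object id and the ref name for every ref being pushed, and a policy
function along these lines could decide whether to accept each one.
The function name and the ancestry callback are hypothetical.  A real
hook might answer the ancestry question with
`git merge-base --is-ancestor`.

```python
ZERO_ID = "0" * 40  # git passes an all-zero id for "ref does not exist"


def allow_ref_update(ref, old_id, new_id, is_ancestor):
    """Decide whether an append-only server should accept a ref update.

    is_ancestor(a, b) must return True when commit a is an ancestor of
    commit b; it is supplied by the caller in this sketch.
    """
    if new_id == ZERO_ID:
        return False  # deleting a ref would lose history
    if old_id == ZERO_ID:
        return True  # creating a new ref only adds data
    if ref.startswith("refs/tags/"):
        return False  # an existing tag must never be moved
    # Branches may only move forward, so force pushes are refused.
    return is_ancestor(old_id, new_id)
```

Under such a policy the archive only ever grows, whatever later happens
to the development repository the tag was originally pushed to.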

-- 
Best regards,
Aigars Mahinovs        mailto:aigar...@debian.org
  #--#
 | .''`.Debian GNU/Linux (http://www.debian.org)|
 | : :' :   Latvian Open Source Assoc. (http://www.laka.lv) |
 | `. `'Linux Administration and Free Software Consulting   |
 |   `- (http://www.aiteki.com) |
 #--#



Re: [RFC] General Resolution to deploy tag2upload

2024-06-13 Thread Sean Whitton
Hello,

On Thu 13 Jun 2024 at 10:33am +02, Andreas Tille wrote:

> thanks to all for this GR.  I like tag2upload in principle.  The only
> thing I'm a bit scared about is that it simplifies uploading something
> that was never built before on the local machine.  Sure, this can be
> done with source-only uploads as well, but tag2upload makes it even
> easier.

I don't believe it makes any difference.  We already have 'dgit
push-source' which will do a source-only upload with a single command
invocation.  And if 'dgit push-source' errors out, that's equivalent to
tag2upload failing to upload and e-mailing you.

> Maybe I missed it in the long thread and I need to admit that I have
> not read the docs, thus the explicit question here: Is the package
> undergoing some CI test (maybe not only building but also autopkgtest
> which I'm doing locally for any package I'm uploading) before it is
> forwarded to dak?

No, there is nothing additional being done.

Now we have salsa CI, though, we have various good options for automated
pre-upload testing.

-- 
Sean Whitton



Re: [RFC] General Resolution to deploy tag2upload

2024-06-13 Thread Sean Whitton
Hello,

On Wed 12 Jun 2024 at 04:23pm +01, Simon McVittie wrote:

> On Wed, 12 Jun 2024 at 15:20:45 +0100, Ian Jackson wrote:
>> tag2upload, like dgit, ensures and insists that the git tree you are
>> uploading corresponds precisely [1] to the generated source package.
>>
>> If you base your Debian git maintainer branch on the upstream git (as
>> you should) and there is a discrepancy between the contents of the
>> upstream git branch, and the .orig.tar.gz you're using, the upload
>> will fail.
>
> Is your position here that if your upstream releases source tarballs
> that intentionally differ from what's in git (notably this is true
> for Autotools `make dist`), then any Good™ maintainer must generate
> their own .orig.tar.* from upstream git and use those in the upload,
> disregarding upstream's source tarball entirely?
>
> That approach has many advantages, but it flatly contradicts what devref
> claims a Good™ maintainer would do, which is to always use the pristine
> source tarball as released by upstream (unless it's non-free) - which
> implies that if they're using dgit, then the upstream tree must match
> an import of the tarball.

dev-ref is out-of-date here, I think.  There is no longer a consensus we
should be doing that.

-- 
Sean Whitton



Re: [RFC] General Resolution to deploy tag2upload

2024-06-13 Thread Sean Whitton
Hello,

On Wed 12 Jun 2024 at 09:42am -07, Russ Allbery wrote:

> Bastian Blank  writes:
>
>> If we need a design, then we can easily avoid the problem points.  There
>> is a working counter proposal open:
>
>> https://bblank.thinkmo.de/introducing-uploads-debian-git.html
>
> It requires a sufficiently reproducible build for source packages.
> Right now it is only known to work with the special 3.0 (gitarchive)
> source format, but even that requires the latest version of this
> format. No idea if it is possible to use others, like 3.0 (quilt) for
> this purpose.
>
> This sounds like a major blocker to me.  tag2upload works with the
> existing representations of Debian packages in Git and with the existing
> supported source package formats.

Yes.  A proposal that has not yet engaged with the complexities of
3.0 (quilt) is not one in which we can yet have any confidence.
As we discovered building dgit, it's substantially more complex than one
realises.

-- 
Sean Whitton



Re: [RFC] General Resolution to deploy tag2upload

2024-06-13 Thread Sean Whitton
Hello,

On Thu 13 Jun 2024 at 08:23am +02, Thomas Goirand wrote:

> One thing I really dislike, is having a single gpg key to upload them all. I
> very much preferred the design that Didier explained during Debconf Kosovo,
> where the .changes signature is uploaded together with the tagged commit.
>
> Your thoughts?
>
> Cheers,
>
> Thomas Goirand (zigo)
>
> P.S: The thread is huge, I have no time to read it all, sorry if someone else
> also raised the same concern.

I'm not sure about the characterisation that it's one key to upload them
all.  tag2upload will be an official service, no less so than ftp-master
-- you could as well say that the current archive signing key is one key
to release them all.

This message from Ian argues against adding things like .changes files:
.

Please excuse me if this does not address exactly Didier's design, with
which I am not familiar.

-- 
Sean Whitton



Re: [RFC] General Resolution to deploy tag2upload

2024-06-13 Thread Pierre-Elliott Bécue
Hi,

Luca Boccassi  wrote on 12/06/2024 at 13:15:47+0200:

> On Wed, 12 Jun 2024 at 12:03, Jonas Smedegaard  wrote:
>>
>> Quoting Luca Boccassi (2024-06-12 12:28:21)
>> > On Wed, 12 Jun 2024 at 09:35, Jonas Smedegaard  wrote:
>> > >
>> > > Quoting Luca Boccassi (2024-06-12 10:21:40)
>> > > > On Wed, 12 Jun 2024 at 02:31, Russ Allbery  wrote:
>> > > > >
>> > > > > Luca Boccassi  writes:
>> > > > >
>> > > > > > And on the implementation details, I really do not like the idea of
>> > > > > > having a competing git forge with Salsa. This dgit server seems to 
>> > > > > > just
>> > > > > > be a ye olde git-web interface.
>> > > > >
>> > > > > Does it support gitweb?  I thought it only supported regular Git
>> > > > > operations, but I could be mistaken.
>> > > >
>> > > > I might be wrong, but this is what this looks like to me (it was
>> > > > linked to me on IRC yesterday, wasn't aware of it before):
>> > > >
>> > > > https://browse.dgit.debian.org/
>> > > >
>> > > > > > If this goes forward, in my opinion it should exclusively use Salsa
>> > > > > > as the git server, to avoid duplicating infrastructure.
>> > > > >
>> > > > > I think you want the Git archive to be entirely separate from Salsa
>> > > > > so that it's a reliable source of tracing information.  You don't
>> > > > > want to support force pushes, for example; the whole point is that it
>> > > > > should be append-only, which would be a controversial choice for
>> > > > > Salsa but which is fine for the archives of the uploaded packages.  I
>> > > > > would also want a much smaller attack surface for that type of record
>> > > > > than GitLab.  GitLab is designed as a place to do interactive
>> > > > > work, not to keep a reliable permanent record.
>> > > >
>> > > > The git repositories, sure. The git forge? I don't see why. You can
>> > > > have these repositories in a separate namespace, which sets strong
>> > > > branch and tag protection rules to achieve what you describe. As far
>> > > > as I am aware, this is possible to do in Salsa already, it doesn't
>> > > > have to be a per-forge rule, it can be per-namespace, I think this is
>> > > > possible to achieve in Gitlab. I have not used tag protection rules
>> > > > (on gitlab, I used them on github though), but I do regularly use
>> > > > branch protection rules on my Salsa repositories.
>> > > >
>> > > > To be clear, I am exclusively talking about the git forge, as in
>> > > > salsa.debian.org, not the git repositories as they might exist on
>> > > > Salsa under the debian/ namespace or any other namespace.
>> > > >
>> > > > Having a separate namespace with strong ACLs seems exactly what you
>> > > > want, even if it duplicates the individual repositories (the backend
>> > > > git store deduplicates it anyway, so in practice it should be quite
>> > > > cheap). Having an entire separate git forge that competes with Salsa
>> > > > seems orthogonal to this, and counterproductive for the project.
>> > >
>> > > I fail to recognize how strong ACLs achieves exactly the same separate
>> > > storage on a separate host.  Especially when the purpose is to minimize
>> > > attack vectors.
>> >
>> > As per the security review just shared, admin access to Salsa allows
>> > to push commits anyway which would get uploaded just the same, and
>> > again as per security review, this case benefits from centralizing:
>> > one host to maintain, and one set of admins to trust, is better than
>> > two. Especially as Salsa is Gitlab, which is maintained upstream and
>> > benefits from the many-eyes-and-many-users situation, while a
>> > completely custom local git forge reimplementation, other than
>> > inevitably suffering from bitrot at some point in the future, like all
>> > custom infrastructure, will have the disadvantage that nobody else
>> > uses it. This is the reason Alioth is gone, and it's a very good
>> > reason.
>>
>> So your argument is that that strong ACLs achieve exactly the same as
>> separate storage on a separate host, because separate storage on a
>> separate host inevitably leads to bitrot and lack of eyeballs.
>>
>> I rest my case.
>
> No, my argument is that append-only can (as far as I can tell) be
> achieved on Salsa too, it doesn't seem to necessitate a bespoke forge.
> The centralizing argument is not mine, it's from the security review
> that was published this morning:
>
> "My security recommendation in this case is therefore to centralize
> the risk as much as possible, moving it off of individual uploader
> systems with unknown security profiles and onto a central system that
> can be analyzed and iteratively improved."
>
> https://lists.debian.org/debian-vote/2024/06/msg4.html
>
>> > > > > That Git archive is not parallel to or competitive with Salsa and 
>> > > > > doesn't
>> > > > > provide most of the functionality that Salsa does.  It has a 
>> > > > > different
>> > > > > purpose.
>> > > >
>> > > > I disagree strongly. As we have seen in the recent Salsa thread on
>> > 

Re: [RFC] General Resolution to deploy tag2upload

2024-06-13 Thread Sean Whitton
Hello,

On Wed 12 Jun 2024 at 11:14am +02, Ansgar  wrote:

> Hi,
>
> On Wed, 2024-06-12 at 08:59 +0200, Holger Levsen wrote:
>> Am Tue, Jun 11, 2024 at 10:27:56PM -0700 schrieb Russ Allbery:
>> > > As I said several times before: the implementation has known
>> > > security
>> > > bugs (unless you fixed them). But I guess this is going to get
>> > > ignored
>> > > again anyway...
>> > Could you describe what known security vulnerabilities you believe
>> > exist,
>>
>> does it matter if this GR is about a design? currently the RFC is not
>> to vote about an implementation... :/
>
> As far as I understand, the GR is about pushing the design and
> implementation as is, without any changes. It very explicitly says so.

It does not say this.

-- 
Sean Whitton



Re: [RFC] General Resolution to deploy tag2upload

2024-06-13 Thread Sean Whitton
Hello,

On Thu 13 Jun 2024 at 12:08am +02, Joerg Jaspert wrote:

> On 17258 March 1977, Sean Whitton wrote:
>
>>>> So there is no change here.
>>> Actually, we can set acls on fingerprints and then that key won't be able
>>> to upload anymore. That is not something recorded in the keyrings or the
>>> DM list. Obviously that is not something used often (really really
>>> seldom), it is more for "this key is compromised badly, please turn off
>>> anything with it *NOW*" situations, which is what Helmut meant with the
>>> urgent cases.
>> Could you say more specifically how seldom, and also how long it usually
>> takes between you flicking the emergency switch, and the keyring team
>> pushing an update?
>
> *Really* seldom. I would have to dig and see when, especially for the
> timing thing with keyring team.

Thanks.  Then possibly it is sufficient for ftpmaster just to disable
tag2upload's whole key until the keyring update is pushed.

-- 
Sean Whitton



Re: [RFC] General Resolution to deploy tag2upload

2024-06-13 Thread Sean Whitton
Hello,

On Wed 12 Jun 2024 at 11:18am -06, Gunnar Wolf wrote:

> I am mentioning this because I see quite a bit of friction in this
> regard. Some people see your tag2upload proposal as a step to diminish
> Salsa's place in Debian and probably even have it fully replaced in
> the future.

I think you have figured this out for yourself, but just to say:
"some people" is here just one person, Luca.

It is very clear to me that tag2upload cements the position of salsa.

-- 
Sean Whitton



Re: [RFC] General Resolution to deploy tag2upload

2024-06-13 Thread Sean Whitton
Hello,

On Thu 13 Jun 2024 at 12:04am +02, Joerg Jaspert wrote:

> Now, again: tag2upload/dgit is not in this category. Not even a little
> nor close to.

Indeed.

> And then something that didn't appear yet: Has anyone asked the Salsa
> admins if they even would like tag2upload? Tell you what, the answer is
> *no*. This does *NOT* belong on Salsa. This should *not* end up on
> Salsa, and we will fight any such move. This is good to go on a
> different host and stay separate. Different people and different
> machine. It is an addition, probably a useful one, but nothing to
> co-exist on the existing forge.

Thanks for sharing this, good to be on the same page.

-- 
Sean Whitton



Re: [RFC] General Resolution to deploy tag2upload

2024-06-13 Thread Andreas Tille
Hi,

thanks to all for this GR.  I like tag2upload in principle.  The only
thing I'm a bit scared about is that it simplifies uploading something
that was never built before on the local machine.  Sure, this can be
done with source-only uploads as well, but tag2upload makes it even
easier.

Maybe I missed it in the long thread and I need to admit that I have not
read the docs, thus the explicit question here:  Is the package
undergoing some CI test (maybe not only building but also autopkgtest
which I'm doing locally for any package I'm uploading) before it is
forwarded to dak?

Thanks again for all your work
  Andreas.

-- 
https://fam-tille.de



Re: [RFC] General Resolution to deploy tag2upload

2024-06-13 Thread Sean Whitton
Hello,

On Wed 12 Jun 2024 at 12:37pm +01, Luca Boccassi wrote:

> This is largely in the eye of the beholder as there's no strict
> definition that I am aware of, so one could or could not include
> these, but I do note that what you describe above is not really that
> different from Alioth - that also didn't have merge requests or CIs,
> and we didn't really use the rudimentary ticket system (IIRC it did
> have one? Might be wrong). If Alioth was a forge, and I think it was,
> then this alternative system also sounds like a forge to me.

You can't push work-in-progress to dgit-repos.  You can only push there
by uploading.  We all push work-in-progress things to salsa all the
time, between uploads.

-- 
Sean Whitton



Re: [RFC] General Resolution to deploy tag2upload

2024-06-13 Thread Mathias Behrle
* Marco d'Itri: " Re: [RFC] General Resolution to deploy tag2upload" (Wed, 12
  Jun 2024 14:37:25 - (UTC)):

> deb...@kitterman.com wrote:
> 
> >As I understand it, Debian was affected by the xz-utils hack, in part,
> >because some artifacts were inserted into an upstream tarball that were not 
> >represented in the upstream git.  Please explain how use of tag2upload is 
> >relevant to this scenario?  I'm afraid I don't follow.  
> I think that it was assumed, and I agree, that a well-maintained Debian
> git source tree has the upstream branch pulled from the upstream git
> repository, keeping the complete history, and not created locally by
> importing upstream tar release archives.

Just as a note often forgotten in this discussion:

There are upstreams, that don't use git and are even heavily opposed to git.
Hopefully I have nevertheless "well-maintained Debian git source trees" for the
Tryton suite... ;)

-- 

Mathias Behrle
PGP/GnuPG key available from any keyserver, ID: 0xD6D09BE48405BBF6
AC29 7E5C 46B9 D0B6 1C71  7681 D6D0 9BE4 8405 BBF6



Re: [RFC] General Resolution to deploy tag2upload

2024-06-13 Thread Thomas Goirand

On 6/12/24 00:25, Sean Whitton wrote:

> Hello everyone,
>
> This is a draft GR.
>
> [...]

Hi Sean,

Thanks for your work on this.

One thing I really dislike, is having a single gpg key to upload them 
all. I very much preferred the design that Didier explained during 
Debconf Kosovo, where the .changes signature is uploaded together with 
the tagged commit.


Your thoughts?

Cheers,

Thomas Goirand (zigo)

P.S: The thread is huge, I have no time to read it all, sorry if someone 
else also raised the same concern.




Re: [RFC] General Resolution to deploy tag2upload

2024-06-13 Thread Simon Richter

Hi,

On 6/13/24 06:00, Luca Boccassi wrote:


> Yes, that's the argument - all Salsa features are bad and "bloat":
> issues are bad, teams are bad, CIs are bad, merge requests are bad,
> the only thing needed is to push to some git backend, everything
> else is bad and unneeded.


The requirements for the archive differ from and sometimes conflict with 
the requirements for collaborative packaging, which differ from and 
sometimes conflict with the requirements for regular development.


It is completely valid, and in my opinion also better, to deploy 
multiple targeted solutions in parallel and make them interface with 
each other than try to graft the missing use cases onto a 95% solution 
that is being built by an external party that has their own product road 
map which does not include these use cases.[1]


The archive does not need to keep the full collaboration history, for 
example -- in fact it is counterproductive, because the archive is being 
mirrored and archived, and we need to keep the size small.


The archive also has a strict requirement *never* to execute any code 
from uploaded packages on machines used in upload processing. From the 
point of view of the archive software, we are dealing with data only. 
This is a requirement salsa cannot ever fulfill.


In the reverse direction, that is also true: the archive maintenance 
software can not perform a CI build because the security posture it 
takes forbids it from doing so, so it can not replace salsa.


GitLab issues are missing the archive integration of the Debian BTS, and 
the tracking of issues forwarded to other bug trackers. Therefore, salsa 
cannot replace the Debian BTS for packaging work. We could try writing 
bots, but it would be a graft, not a targeted solution.


On the other hand, the Debian BTS does not have git integration, so I 
cannot refer to bugs from commits and track individual work items 
between releases this way -- it mainly provides an external view, and an 
interface for users, and is less useful for team-internal collaboration.


Again, we have two different use cases, and two different tools, and 
neither can replace the other. There is no working hierarchy in which 
one is objectively better than the other, because the measuring stick is 
fitness for a particular purpose.[1]


   Simon

[1] this applies to salsa and systemd.