On 20 Dec 2022, Evgeny Kotkov via dev wrote:
[Moving discussion to a new thread]

We currently have a problem: a working copy relies on a checksum type with known collisions (SHA1). A solution to that problem is to switch to a checksum type without known collisions in one of the newer working copy formats.

Since we plan on shipping a new working copy format in 1.15, this seems to be an appropriate moment to decide whether we'd also want to switch to a checksum type without known collisions in that new format.

Below are the arguments for including a switch to a different checksum type
in the working copy format for 1.15:

1) Since the "is the file modified?" check now compares checksums, leaving everything as-is may be considered a regression, because it would add more cases where a working copy relies on comparing checksums of a type with known collisions (see the sketch just after this list).

2) We already need a working copy format bump for the pristines-on-demand feature. So using that format bump to solve the SHA1 issue might reduce the overall number of required bumps for users (assuming that we'll still
  need to switch from SHA1 at some point later).

3) While the pristines-on-demand feature is not yet released, upgrading with a switch to the new checksum type seems to be possible without requiring a network fetch. But if some of the pristines are optional, we lose the ability to rehash all contents in place. So we might find ourselves having to choose between two unattractive alternatives: either requiring a network fetch during the upgrade or entirely prohibiting an upgrade of working copies with optional pristines.
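
For concreteness, here is a minimal Python sketch of the checksum-based "is the file modified?" check that point 1 refers to. This is not Subversion's actual C code; the function names and the chunked-read helper are hypothetical:

[[[
import hashlib

def file_checksum(path, algo="sha1"):
    # Hash the file in chunks; "algo" would become e.g. "sha256" in a
    # working copy format that has switched away from SHA1.
    h = hashlib.new(algo)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(64 * 1024), b""):
            h.update(chunk)
    return h.hexdigest()

def is_modified(working_path, recorded_pristine_checksum, algo="sha1"):
    # With SHA1, two different contents can produce the same digest
    # (see shattered.io), so a carefully crafted modification could be
    # reported as "not modified" by this comparison.
    return file_checksum(working_path, algo) != recorded_pristine_checksum
]]]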

Thoughts?

A few thoughts:

First, Daniel Shahaf raises the question of whether there is really a problem here. That is: why do we care about possible collisions when they're unlikely to happen in practice unless deliberately caused?

My answer is: we should care because it's very difficult to imagine all the consequences -- including but not limited to clever deliberate attacks -- that might follow from losing a property we formerly had. The hash semantics we have always assumed are "If the file is modified, the hash will change." When those semantics change, we don't need to be able to think immediately of a specific problematic scenario to know that this is a significant development. We've lost the guarantee; that's enough to be worth worrying about.
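
To make the lost guarantee concrete: the two public SHAttered PDFs from https://shattered.io/ have different contents but the same SHA1 digest. A small Python sketch, assuming both files have been downloaded into the current directory:

[[[
import hashlib

def digest(path, algo):
    with open(path, "rb") as f:
        return hashlib.new(algo, f.read()).hexdigest()

a, b = "shattered-1.pdf", "shattered-2.pdf"
print(digest(a, "sha1") == digest(b, "sha1"))      # True: SHA1 collides
print(digest(a, "sha256") == digest(b, "sha256"))  # False: contents differ
]]]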

BUT, if you want a scenario, here's one:

I have put WordPress installations under Subversion version control before. Once, I detected an attack on one of those WordPress servers when one of the things the attacker did was modify some of the WordPress scripts on the server. Those files showed up as modified when I ran 'svn st', and from there I ran 'svn diff' and figured out what had happened. But a super-careful attacker could make modifications that leave the version-controlled files with the same SHA1 hash they had before, thus making it harder to detect the attack.

Yes, I realize there are other ways to detect modifications, and that random attackers are unlikely to take the trouble to preserve hashes. On the other hand, a well-resourced spear-phishing attacker who knows something about their target's use of SVN might indeed try a hash-preserving approach to breaking in. The point is, if we're counting on the hashes having certain semantics, then our users are counting on it too. If SHA1 no longer has those semantics, we should upgrade.

Second, +1 to what Branko said: we should upgrade to a new hash when we upgrade a working copy anyway, but new clients should still be able to handle the old hash in old working copies without upgrading them.
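
As a rough illustration of what that dual-hash handling might look like (the names and format numbers below are invented for this sketch, not Subversion's actual API), a client could key the checksum kind off the working copy format it finds on disk:

[[[
import hashlib

# Illustrative only: pretend older formats record SHA1 checksums and a
# hypothetical newer format records SHA-256.
CHECKSUM_KIND_BY_WC_FORMAT = {31: "sha1", 32: "sha256"}

def wc_checksum(path, wc_format):
    # Hash the file with whichever checksum kind this working copy
    # format records, instead of assuming SHA1 everywhere.
    algo = CHECKSUM_KIND_BY_WC_FORMAT[wc_format]
    h = hashlib.new(algo)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(64 * 1024), b""):
            h.update(chunk)
    return algo, h.hexdigest()
]]]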

Now, how hard would this be to actually implement? The pristineless-format WC upgrade is an opportunity to make other format changes, but I'd hate to block the release of pristineless working copies on this...

Best regards,
-Karl
