(Resending to the list since I forgot to switch away from the Google Groups address that was on the thread. My apologies to those who get this email twice.)
On 6/11/15 7:25 PM, Ruchir Arya wrote:
> Hi Brane, i didnt get you. How can the server admin modify the content if
> contents are signed? Let me give a scenario, suppose we implement Public Key
> Infrastructure in SVN, where each client generates its private key and public
> key and registers this public key with the server so that anyone can access
> the public key to verify the contents.
>
> Suppose algorithm works in this way.
>
> 1. Client computes hash of (contents concatenated with some revision
> properties), then sign this hash with its private key and sends this signed
> hash with the contents and revision properties.
> 2. So, now if server modifies any content, server dont know the private key of
> client, so server cant generate valid signed hashed.
> 3. Hence i agree with, server can put some garbage data. But server wont be
> able to do false accusation on some other clients. (Like in current SVN,
> server can change the name of client in log files, and it can accuse some
> other client for that particular commit.
> 4. But after implement PKI, server cant accuse another client. It just can
> currupt data, which can be determined too at the time of verification of
> signed hash using public key.

Your signing scheme only protects individual revisions. It does not protect the sequence of revisions that make up the repository. Thus a server admin can add or delete revisions. You have no way of knowing if this happened.

Perhaps that's sufficient for you if all you care about is validating that a particular identity created the revision. But your scheme as suggested falls apart even for that if you have the server be in charge of holding the public keys and vending them out. Because a server admin can simply modify a revision, generate a new public/private key pair and insert it into the public keys that it vends to you.
You might conclude that's fine because a human can pick out the fact that an extra keypair exists for the identity (or possibly for an unknown identity). But when you start considering very large repositories with large user bases, that becomes unwieldy. Consider that the ASF repository currently has 1,685,036 revisions. Last time I checked there were several thousand committers. It's practically impossible for anyone to know what key pairs would be valid for all of those identities. Any sort of automation would ultimately hide that.

You might solve that with a web of trust or a central authority that signs the keypairs (PGP model or SSL model). But both of these have their flaws. I think the whole thing would be very difficult to implement in a way that's useful.

If you wanted to try to solve the greater problem of whole repository integrity then you pretty much need to handle signature chaining, i.e. taking a hash of the predecessor revision and including it in what you sign. The problem with that is that the client doesn't know what the predecessor revision is until it receives back the revision from the commit command. The current design lets multiple clients build up transactions simultaneously against SVN and then only take out an exclusive lock on the repository for the final merge. If a predecessor conflicts with the transaction, the client receives an out-of-date error and then the user can run update, deal with conflicts and then commit again. Changing this would result in a significant performance degradation.

Of course you could create a modified client that handles the simplistic signing of commits and verification without any server involvement whatsoever. You'd have to get everyone to use it for it to be useful. But it could be required by way of hook scripts. If you really want this I'd implement something like this as a proof of concept.
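To make the difference between per-revision signing and chaining concrete, here is a rough sketch. It is not anything SVN does today. The names (digest, build, verify) are made up for illustration, and the actual public-key signature is left out because Python's standard library has no signing primitive; the digest simply stands in for "the thing the client would sign".

```python
import hashlib


def digest(content: bytes, author: str, prev: bytes = b"") -> bytes:
    """Hash one revision; prev binds it to its predecessor when non-empty."""
    h = hashlib.sha256()
    for part in (prev, author.encode(), content):
        # Length-prefix each field so field boundaries are unambiguous.
        h.update(len(part).to_bytes(8, "big"))
        h.update(part)
    return h.digest()


def build(revisions, chained):
    """Return (content, author, digest) records for a list of revisions."""
    records, prev = [], b""
    for content, author in revisions:
        d = digest(content, author, prev if chained else b"")
        records.append((content, author, d))
        prev = d
    return records


def verify(records, chained):
    """Recompute every digest; with chaining, each one covers its predecessor."""
    prev = b""
    for content, author, d in records:
        if digest(content, author, prev if chained else b"") != d:
            return False
        prev = d
    return True


history = [(b"r1 contents", "alice"), (b"r2 contents", "bob"), (b"r3 contents", "carol")]

plain = build(history, chained=False)
chain = build(history, chained=True)

# A server admin silently drops the middle revision from both histories.
del plain[1]
del chain[1]

plain_ok = verify(plain, chained=False)  # deletion goes unnoticed
chain_ok = verify(chain, chained=True)   # deletion breaks the chain
```

With only per-revision digests every remaining revision still checks out, so the deletion is invisible. With chaining, the revision after the gap no longer matches its predecessor. The catch is the one described above: the client has to know the predecessor before it can compute what to sign.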
But honestly, despite thinking this was a cool idea at one time, I'm pretty much not sold on the actual utility. As Branko alluded to in his email, it's actually really hard to change a repository after the fact (other than revision properties). The client and the server both presume that revisions are immutable. Changing them will break things. Which means all anyone can really change is revision properties. So you can change the commit message, the author and the date of the commit. That's not terribly useful.

In the case of an open source project, that information is typically transmitted to an email list. Those lists are cached by many people. Modifications of revision properties also trigger emails to these lists. Which means you could search for any potentially malicious modification of the important revision properties by simply going through multiple archives of the mailing list and comparing them to the repository.

That sounds like a lot of work, but then so is what you're proposing. And yet I think in the end your solution is not any better, and in fact is far weaker, because it'd be practically impossible to modify all the mailing list archives. That leaves a malicious server admin having to stop the mail while they do something untoward, and even then all they can really achieve is changing the revision properties. It's simply not worth it.

Solving it might be an interesting but academic puzzle. But I don't think it has any practical purpose.

Signing commits for a distributed version control system is of course a very different matter. That's why you see git supporting this and SVN without this support.

