Possible vulnerability to SHA-1 collisions
Evil Guy creates 2 files, 1 evil and 1 innocuous, with the same SHA-1 checksum (including Git header). Mr. Evil creates a local branch with an innocuous name like “test-bugfix”, and adds a commit containing a reference to the evil file. Separately, using a sockpuppet, Evil Guy creates an innocuous bugfix (very likely to be accepted) containing the innocuous file, and submits it to Good Guy.

Before Good Guy can commit the bugfix, Evil Guy pushes the evil branch to Github, and then immediately deletes it; or equivalently --force pushes any innocuous commit on top of it. (This is unlikely to arouse suspicion, and he can always say he deleted it because it didn’t work.)

Git keeps unreferenced objects around for a few weeks, so when Good Guy commits the patch and pushes to Github, an object with an sha1sum that matches the good file will already exist in the main repository. Since Git keeps the local copy of files when sha1sums match, the main Github repository will then contain the evil file associated with Good Guy’s commit. Any users cloning from Github will get the evil version. This is an exploit.

And Good Guy’s local repository will contain the good file; he will not notice anything amiss unless he nukes his local repository and clones from Github again. Even when the compromise is discovered, there will be no reason to suspect Evil Guy; the evil file seems to have been committed by Good Guy.

Previous discussion about hash collisions in Git seems to conclude that they aren’t a security threat. See http://stackoverflow.com/questions/9392365/how-would-git-handle-a-sha-1-collision-on-a-blob/9392525#9392525, Linus Torvalds arguing that Git’s security doesn’t depend on SHA-1 collision resistance.

This proposed exploit does not involve social engineering, or any good guys failing to spot or accepting patches containing evil data (what Good Guy accepts is a genuine bugfix). It contaminates the main public repository in a way that Good Guy won’t immediately notice. It does not require a second-preimage attack; Bad Guy creates both versions of the file. While this does require the bad guy to have commit access, the bad guy can avoid suspicion after the attack.

--
To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
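
For readers unfamiliar with the "(including Git header)" remark in the message above: Git does not hash the raw file bytes. A blob's object name is the SHA-1 of a short header followed by the content, so a colliding pair has to collide with that header included. A minimal sketch in Python follows; the function name `git_blob_sha1` is made up for illustration and is not part of Git.

```python
import hashlib


def git_blob_sha1(content: bytes) -> str:
    """Return the object name Git would assign to a blob with this content.

    Git hashes "blob <size-in-bytes>\\0" followed by the raw content,
    not the content alone.
    """
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


if __name__ == "__main__":
    # Same result as: printf 'hello\n' | git hash-object --stdin
    print(git_blob_sha1(b"hello\n"))
    # The plain SHA-1 of the same bytes is a different value.
    print(hashlib.sha1(b"hello\n").hexdigest())
```

Because the header carries the content length, the two colliding files would also need to be the same size.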
Re: Possible vulnerability to SHA-1 collisions
I don't think there is an issue the way you have tried to describe this scenario.

On Sat, Nov 24, 2012 at 3:12 AM, Michael Hirshleifer <111...@caltech.edu> wrote:
> Evil Guy creates 2 files, 1 evil and 1 innocuous, with the same SHA-1 checksum (including Git header). Mr. Evil creates a local branch with an innocuous name like “test-bugfix”, and adds a commit containing a reference to the evil file. Separately, using a sockpuppet, Evil Guy creates an innocuous bugfix (very likely to be accepted) containing the innocuous file, and submits it to Good Guy. Before Good Guy can commit the bugfix, Evil Guy pushes the evil branch to Github, and then immediately deletes it; or equivalently --force pushes any innocuous commit on top of it. (This is unlikely to arouse suspicion, and he can always say he deleted it because it didn’t work.)

Here you assume Evil Guy has write access to the same repository as Good Guy. Let's assume this is possible, e.g. Evil Guy is actually impersonating White Hat because he managed to steal White Hat's credentials through a compromised host. Typically Evil Guy doesn't have write access to Good Guy's repository, and thus can't introduce objects into it without Good Guy being the one that creates the objects.

But let's just keep the assumption that Evil Guy can write to the same repository as Good Guy, and that he managed to create the bad branch and delete it, leaving the bad object in an unreachable state for 2 weeks.

> Git keeps unreferenced objects around for a few weeks, so when Good Guy commits the patch and pushes to Github, an object with an sha1sum that matches the good file will already exist in the main repository. Since Git keeps the local copy of files when sha1sums match, the main Github repository will then contain the evil file associated with Good Guy’s commit. Any users cloning from Github will get the evil version. This is an exploit.

Typically, Git will fail with an error message when Good Guy pushes. Good Guy's client will (rightly) believe that the object doesn't exist on the remote side; after all, it is unreachable. So his client will include it in the pack being transmitted during push. When this pack arrives on the remote side, the remote will identify that it already has an object named the same as an object coming in the pack. The remote will do a byte-for-byte compare of both objects. As soon as a single byte differs, it will abort with an error.

At this point Good Guy can't push to his repository. `git gc --prune=now` will fix the repository by removing the unreachable object, at which point Evil Guy's evil object is gone.

> And Good Guy’s local repository will contain the good file; he will not notice anything amiss unless he nukes his local repository and clones from Github again. Even when the compromise is discovered, there will be no reason to suspect Evil Guy; the evil file seems to have been committed by Good Guy.

See above. Good Guy would have noticed something is amiss because the object he sent already existed and didn't match.

> Previous discussion about hash collisions in Git seems to conclude that they aren’t a security threat. See http://stackoverflow.com/questions/9392365/how-would-git-handle-a-sha-1-collision-on-a-blob/9392525#9392525, Linus Torvalds arguing that Git’s security doesn’t depend on SHA-1 collision resistance.

This is largely true because there are additional defenses (e.g. the byte-for-byte compare on identical objects), and for projects like the Linux kernel there are many eyes looking at files all of the time. Anything that is amiss would be announced quickly on LKML and discussed until the root cause is identified and resolved.
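
The byte-for-byte check described in the message above can be modeled with a toy content-addressed store. This is only a sketch of the idea, not Git's implementation: `ObjectStore`, `CollisionError`, and the pluggable `hash_fn` are hypothetical names, and the deliberately weak hash in the demo exists only so that a collision can be produced on demand.

```python
import hashlib


class CollisionError(Exception):
    """Raised when an incoming object shares a name with different stored bytes."""


class ObjectStore:
    """Toy content-addressed store that refuses mismatched duplicates."""

    def __init__(self, hash_fn=None):
        self._objects = {}
        # Default to SHA-1; tests may inject a weak hash to force collisions.
        self._hash = hash_fn or (lambda data: hashlib.sha1(data).hexdigest())

    def write(self, data: bytes) -> str:
        name = self._hash(data)
        existing = self._objects.get(name)
        if existing is not None and existing != data:
            # Same name, different bytes: abort instead of silently
            # keeping either copy.
            raise CollisionError(f"object {name} exists with different content")
        self._objects[name] = data
        return name


if __name__ == "__main__":
    # Weak stand-in hash (content length) so two inputs collide trivially.
    store = ObjectStore(hash_fn=lambda data: str(len(data)))
    store.write(b"evil")      # the attacker's object arrives first
    try:
        store.write(b"good")  # the honest push carries the same name
    except CollisionError as err:
        print("push rejected:", err)
```

In this model the second write is rejected rather than silently dropped, which is what makes the collision visible to the person pushing.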
Re: Possible vulnerability to SHA-1 collisions
On Sat, Nov 24, 2012 at 10:09:31AM -0800, Shawn O. Pearce wrote:

> On Sat, Nov 24, 2012 at 3:12 AM, Michael Hirshleifer <111...@caltech.edu> wrote:
> > Evil Guy creates 2 files, 1 evil and 1 innocuous, with the same SHA-1 checksum (including Git header). Mr. Evil creates a local branch with an innocuous name like “test-bugfix”, and adds a commit containing a reference to the evil file. Separately, using a sockpuppet, Evil Guy creates an innocuous bugfix (very likely to be accepted) containing the innocuous file, and submits it to Good Guy. Before Good Guy can commit the bugfix, Evil Guy pushes the evil branch to Github, and then immediately deletes it; or equivalently --force pushes any innocuous commit on top of it. (This is unlikely to arouse suspicion, and he can always say he deleted it because it didn’t work.)
>
> Here you assume Evil Guy has write access to the same repository as Good Guy. Let's assume this is possible, e.g. Evil Guy is actually impersonating White Hat because he managed to steal White Hat's credentials through a compromised host. Typically Evil Guy doesn't have write access to Good Guy's repository, and thus can't introduce objects into it without Good Guy being the one that creates the objects.
>
> But let's just keep the assumption that Evil Guy can write to the same repository as Good Guy, and that he managed to create the bad branch and delete it, leaving the bad object in an unreachable state for 2 weeks.

Actually, it is somewhat easier on GitHub, because we share objects between forks of a repository via the alternates mechanism. So if you can publicly fork the project and push a branch to your fork, you can write to the shared object database. This applies not just to GitHub, but to any hosting service which shares object databases between projects (I do not know offhand if other hosting providers like Google Code do this).

But as you noted later in your email, the byte-for-byte comparison on object collision will let us detect this case and abort when the good guy tries to push.

-Peff

PS I also think the OP's "sockpuppet creates innocuous bugfix" above is easier said than done. We do not have SHA-1 collisions yet, but if the md5 attacks are any indication, the innocuous file will not be completely clean; it will need to have some embedded binary goo that is mutated randomly during the collision process (which is why the md5 attacks were demonstrated with postscript files which _rendered_ to look good, but contained a chunk of random bytes in a spot ignored by the postscript interpreter).
Re: Possible vulnerability to SHA-1 collisions
At 18:07 -0500 27 Nov 2012, Jeff King wrote:
> PS I also think the OP's "sockpuppet creates innocuous bugfix" above is easier said than done. We do not have SHA-1 collisions yet, but if the md5 attacks are any indication, the innocuous file will not be completely clean; it will need to have some embedded binary goo that is mutated randomly during the collision process (which is why the md5 attacks were demonstrated with postscript files which _rendered_ to look good, but contained a chunk of random bytes in a spot ignored by the postscript interpreter).

I don't think that really saves us though. Many formats have parts of the file which will be ignored, such as comments in source code.

With the suggested type of attack, there isn't a requirement about which version of the file is modified. So the attacker should be able to generate a version of a file with an innocuous change, get the SHA-1 for that, then add garbage comments to their malicious version of the file to try to get the same SHA-1.
Re: Possible vulnerability to SHA-1 collisions
On Tue, Nov 27, 2012 at 06:30:17PM -0500, Aaron Schrab wrote:

> At 18:07 -0500 27 Nov 2012, Jeff King wrote:
> > PS I also think the OP's "sockpuppet creates innocuous bugfix" above is easier said than done. We do not have SHA-1 collisions yet, but if the md5 attacks are any indication, the innocuous file will not be completely clean; it will need to have some embedded binary goo that is mutated randomly during the collision process (which is why the md5 attacks were demonstrated with postscript files which _rendered_ to look good, but contained a chunk of random bytes in a spot ignored by the postscript interpreter).
>
> I don't think that really saves us though. Many formats have parts of the file which will be ignored, such as comments in source code.

Agreed, it does not save us unconditionally. It just makes it harder to execute the attack. Would you take a patch from a stranger that had a kilobyte of binary garbage in a comment?

A more likely avenue would be a true binary file where nobody is expected to read the diff.

> With the suggested type of attack, there isn't a requirement about which version of the file is modified. So the attacker should be able to generate a version of a file with an innocuous change, get the SHA-1 for that, then add garbage comments to their malicious version of the file to try to get the same SHA-1.

That's not how birthday collision attacks usually work, though. You do not get to just mutate the malicious side and leave the innocuous side untouched. You are mutating both sides over and over and hoping to find a matching sha1 from the "good" and "evil" sides.

Of course, I have not been keeping up too closely with the efforts to break sha-1. Maybe there is something more nefarious about the current attacks. I am just going off my recollection of the md5 collision attacks.

-Peff
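
The "mutating both sides" point in the message above can be illustrated with a toy birthday search. The sketch below is an illustration only: it truncates SHA-1 to a few bytes so that a match turns up quickly, it appends a counter comment as a stand-in for the "binary goo", and the names `short_hash` and `find_collision` are invented for this example. It says nothing about the cost of a real attack on full SHA-1.

```python
import hashlib
from itertools import count


def short_hash(data: bytes, nbytes: int = 3) -> bytes:
    """SHA-1 truncated to nbytes, so collisions are cheap to find."""
    return hashlib.sha1(data).digest()[:nbytes]


def find_collision(good: bytes, evil: bytes, nbytes: int = 3):
    """Mutate BOTH files until some good variant matches some evil variant.

    Assumes good != evil. Returns (good_variant, evil_variant) whose
    truncated hashes are equal.
    """
    seen_good = {}  # truncated hash -> good variant that produced it
    seen_evil = {}  # truncated hash -> evil variant that produced it
    for i in count():
        filler = b"\n# " + str(i).encode("ascii")  # ignorable padding
        g, e = good + filler, evil + filler
        hg, he = short_hash(g, nbytes), short_hash(e, nbytes)
        seen_good[hg] = g
        seen_evil[he] = e
        if hg in seen_evil:
            return g, seen_evil[hg]
        if he in seen_good:
            return seen_good[he], e


if __name__ == "__main__":
    g, e = find_collision(b"print('bugfix')", b"print('backdoor')")
    print(short_hash(g).hex(), short_hash(e).hex())
    print(g != e)
```

Neither returned file is guaranteed to be the unmodified original, which is the point being made: the innocuous side typically ends up carrying padding as well.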
Re: Possible vulnerability to SHA-1 collisions
On 11/28/2012 01:27 AM, Jeff King wrote:
> On Tue, Nov 27, 2012 at 06:30:17PM -0500, Aaron Schrab wrote:
>
>> At 18:07 -0500 27 Nov 2012, Jeff King wrote:
>>> PS I also think the OP's "sockpuppet creates innocuous bugfix" above is easier said than done. We do not have SHA-1 collisions yet, but if the md5 attacks are any indication, the innocuous file will not be completely clean; it will need to have some embedded binary goo that is mutated randomly during the collision process (which is why the md5 attacks were demonstrated with postscript files which _rendered_ to look good, but contained a chunk of random bytes in a spot ignored by the postscript interpreter).
>>
>> I don't think that really saves us though. Many formats have parts of the file which will be ignored, such as comments in source code.
>
> Agreed, it does not save us unconditionally. It just makes it harder to execute the attack. Would you take a patch from a stranger that had a kilobyte of binary garbage in a comment?
>
> A more likely avenue would be a true binary file where nobody is expected to read the diff.
>
>> With the suggested type of attack, there isn't a requirement about which version of the file is modified. So the attacker should be able to generate a version of a file with an innocuous change, get the SHA-1 for that, then add garbage comments to their malicious version of the file to try to get the same SHA-1.
>
> That's not how birthday collision attacks usually work, though. You do not get to just mutate the malicious side and leave the innocuous side untouched. You are mutating both sides over and over and hoping to find a matching sha1 from the "good" and "evil" sides.
>
> Of course, I have not been keeping up too closely with the efforts to break sha-1. Maybe there is something more nefarious about the current attacks. I am just going off my recollection of the md5 collision attacks.

AFAIR, collision attacks can be executed with a complexity of about 2^51 operations (against a claimed 2^80, that's pretty bad), but preimage attacks are still stuck very close to the claimed 2^160. That means every attack involving SHA1 requires that Mr. Malicious either creates both of the involved files or does exceptional research without sharing it.

I think git's job is to make sure that write access to only one of the repositories is insufficient to launch an attack. If the attacker manages to change all repositories involved then the hash function used is really quite irrelevant.

--
Andreas Ericsson                   andreas.erics...@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and terror, I think we should give some serious thought to declaring war on peace.
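
To put the figures in the message above side by side: for a 160-bit digest the generic birthday bound for collisions is 2^80 and the generic preimage bound is 2^160. The short calculation below takes the 2^51 figure exactly as recalled in the message (it is an assumption here, not something the code verifies) and shows how far below the generic bound it would sit.

```python
DIGEST_BITS = 160  # SHA-1 output size

generic_collision = 2 ** (DIGEST_BITS // 2)  # birthday bound: 2^80
generic_preimage = 2 ** DIGEST_BITS          # brute-force preimage: 2^160
recalled_collision = 2 ** 51                 # figure quoted from memory above

# How many times cheaper the recalled attack is than the generic bound.
speedup = generic_collision // recalled_collision

# Gap between finding any colliding pair and hitting one fixed target.
preimage_gap = generic_preimage // generic_collision

if __name__ == "__main__":
    print(f"collision speedup: 2^{speedup.bit_length() - 1}")
    print(f"preimage vs collision gap: 2^{preimage_gap.bit_length() - 1}")
```

The second number is why the scenario in this thread needs the attacker to author both files: matching a file someone else already wrote is a preimage problem, not a collision problem.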