On 2015-08-07 21:27:03 +0200, Vincent Lefevre wrote:
> On 2015-08-07 15:54:26 +0200, Antonio Diaz Diaz wrote:
> > I have no experience at all rigging tarballs, but it took me just
> > minutes to obtain two xz compressed tarballs with very different
> > contents that match in size and sum(1). I did it just with an
> > editor, ddrescue and data from /dev/urandom, by brute force, without
> > any knowledge about the algorithm of sum. And I did it not once, but
> > twice.
>
> sum(1) just gives a 16-bit checksum! So, it suffices to generate
> N*65536 random compressed tarballs to get around N collisions with
> a given file. Then the only problem is to get the right size, but
> if one has random input, it is (almost) not compressible, so that
> one will get "almost" the same size for each tarball. By controlling
> how compression is done to reach the right size, this should even be
> easier.

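To put rough numbers on the estimate quoted above, here is a minimal sketch in POSIX sh (awk does the floating-point work). It assumes the 16-bit checksum values are uniformly distributed and ignores the size constraint, so it is only an order-of-magnitude check. The function names are made up for illustration. The third function is the usual birthday-bound approximation for a collision between any two candidates, as opposed to a collision with one fixed file.

```sh
#!/bin/sh
# Rough collision arithmetic for a 16-bit checksum such as sum(1).
# Assumption: checksum values are uniformly distributed over 65536 values.

# Expected number of candidates, out of k random ones, whose checksum
# equals that of one fixed target: k / 65536.
expected_matches() {
    LC_ALL=C awk -v k="$1" 'BEGIN { printf "%.2f\n", k / 65536 }'
}

# Probability that at least one of k random candidates matches the target.
match_probability() {
    LC_ALL=C awk -v k="$1" 'BEGIN { printf "%.3f\n", 1 - (1 - 1/65536)^k }'
}

# Approximate number of random candidates needed before any two of them
# collide, for n equally likely values: sqrt(pi * n / 2).
birthday_trials() {
    LC_ALL=C awk -v n="$1" 'BEGIN { printf "%.0f\n", sqrt(atan2(0, -1) * n / 2) }'
}

expected_matches 655360    # prints 10.00
match_probability 65536    # prints 0.632
birthday_trials 65536      # prints 321
```

So 10*65536 candidates give about 10 matches against a fixed file, whereas a few hundred candidates are typically enough for two arbitrary ones to collide, provided their sizes mostly coincide.
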
The following script gives lzip collisions after a few seconds
between arbitrary lzip tarballs. This is easier than a collision
with a fixed tarball because of the birthday paradox. But one can
do something similar, with at most a few million attempts, to get
a collision with some given tarball of about 64 KB.

A real test should be based on a hash for which one knows how to
build collisions only by using well-chosen garbage, like MD5.

If xz can contain arbitrary garbage without affecting other parts
of the file (while still keeping its validity), this is quite bad,
but still safer than garbage at the end, which is IMHO the worst
for incremental hashes.

#!/usr/bin/env zsh

# Maps "checksum size" to the first tarball seen with that pair.
typeset -A a

# 64 KB of random (hence almost incompressible) data.
r=test-random
head -c 65536 /dev/urandom > $r
rm -f foo*.tar.lz

for i in {1..1000}
do
  file=file$i
  tarf=foo$i.tar
  # Same content under a different name, so that each tarball differs.
  ln -f $r $file
  tar cf $tarf $file
  lzip $tarf
  rm $file
  tarz=$tarf.lz
  # Key: sum(1) checksum and file size in bytes.
  s="$(sum $tarz | cut -d' ' -f1) $(command stat -c %s $tarz)"
  if [[ -n $a[$s] ]]
  then
    # Collision found: print both file names and stop.
    echo $a[$s]
    echo $tarz
    break
  fi
  a[$s]=$tarz
done

-- 
Vincent Lefèvre <vinc...@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)