My cryptographically-inclined friend suggested we use a universal hash function or something a bit stronger, such as VHASH.
These functions take a "key", which we could choose at random and fix in the code. VHASH outputs 64-bit digests with a collision probability of about 2^-61, so you'd expect to hash on the order of 2^30 files before seeing a collision. If that isn't good enough, we could compute two VHASH digests with different keys and concatenate them.

-Justin

On Tue, Oct 19, 2010 at 2:31 PM, Martin Pool <m...@sourcefrog.net> wrote:
> On 20 October 2010 08:15, Joel Rosdahl <j...@rosdahl.net> wrote:
>> MD4 has been there from the start and neither Tridge nor I have seen any
>> reason to switch it. MD5, SHA1 and other even more modern cryptographic
>> hash functions are indeed stronger but also slower, and the increased
>> resistance against various crypto attacks doesn't seem necessary in a
>> tool like ccache. That said, I'm sure there nowadays may exist hash
>> functions that are both better (i.e., with lower collision rate) AND
>> faster than MD4. Do you (or anyone else) know of any with properties
>> that would be a good fit for ccache?
>
> I think any of the cryptographic hash functions will have an even
> distribution of outputs, so nothing else will give stronger resistance
> to accidental collision. The only problem with MD4 is that it might
> be vulnerable to malicious collisions (which seems pointless in ccache
> as it currently exists) and that others might be faster.
>
> --
> Martin
> _______________________________________________
> ccache mailing list
> ccache@lists.samba.org
> https://lists.samba.org/mailman/listinfo/ccache
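
For anyone who wants to check the collision arithmetic, here is a rough Python sketch. It assumes a per-pair collision probability of 2^-61 for one 64-bit VHASH digest, and it assumes that two digests computed under independent keys collide independently, so the concatenated digest has a per-pair probability of (2^-61)^2. The function names are made up for illustration and nothing here calls VHASH itself.

```python
import math


def expected_colliding_pairs(n_files: int, pair_collision_prob: float) -> float:
    """Expected number of colliding pairs among n_files digests.

    There are C(n, 2) pairs, each colliding with probability pair_collision_prob.
    """
    return math.comb(n_files, 2) * pair_collision_prob


def files_for_one_expected_collision(pair_collision_prob: float) -> float:
    """Approximate n where n^2 / 2 * eps = 1, i.e. n = sqrt(2 / eps)."""
    return math.sqrt(2.0 / pair_collision_prob)


# Assumed figure for a single 64-bit VHASH digest.
EPS_SINGLE = 2.0 ** -61
# Two digests under independent keys, concatenated (assumes independence).
EPS_DOUBLE = EPS_SINGLE ** 2

for label, eps in (("one digest", EPS_SINGLE), ("two digests", EPS_DOUBLE)):
    n = files_for_one_expected_collision(eps)
    print(f"{label}: about 2^{math.log2(n):.1f} files per expected collision")
```

Under this estimate a single digest gives about 2^31 files per expected collision, the same order of magnitude as the 2^30 mentioned above, and concatenating two digests pushes that to roughly 2^61.5.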