On Nov 29, 2013, at 6:26 PM, Scott Ribe <scott_r...@elevated-dev.com> wrote:

> I recently (last week, no kidding) investigated this question, and went with 
> murmur hash.

Murmur is an excellent 32-bit hash function, but it’s not suitable for Graham’s 
use case where a collision would result in data loss.


>> In this scheme, if there is a hash collision, you lose user data. That 
>> should be a non-starter. You *must* do a full bytewise comparison in case of 
>> collision. 
> a 1 in 2^64, or 2^128, or 2^256 chance, so no, it is just fine.

It’s worse than that, actually, because of what’s known as the “birthday 
paradox”[1]. In a group of 23 people, there’s a 50% chance that two of them 
have the same birthday. Similarly, as the number of objects you’re hashing 
grows, the chance of a collision grows much more rapidly, because it’s 
proportional to the number of _pairs_ of objects.

Generally with a 160-bit hash function the odds of a collision for any 
reasonably-sized data set are negligible, as long as the hash function is 
reasonably random. (The trust in the randomness of SHA-1 is eroding, though.)

—Jens

[1] http://en.wikipedia.org/wiki/Birthday_paradox
[2] http://en.wikipedia.org/wiki/Birthday_attack
_______________________________________________

Cocoa-dev mailing list (Cocoa-dev@lists.apple.com)

Please do not post admin requests or moderator comments to the list.
Contact the moderators at cocoa-dev-admins(at)lists.apple.com

Help/Unsubscribe/Update your Subscription:
https://lists.apple.com/mailman/options/cocoa-dev/archive%40mail-archive.com

This email sent to arch...@mail-archive.com

Reply via email to