https://bugzilla.wikimedia.org/show_bug.cgi?id=28419

--- Comment #51 from Tyler Romeo <tylerro...@gmail.com> 2012-07-10 12:45:31 UTC ---
He mentions in the article that the index is rather large (larger than the
actual data). It's possible his data is inaccurate or unrepresentative, but
considering the rather random nature of hash functions I doubt this is so. His
technique will require a large amount of RAM. However, the other thing that he
mentions is the attacker's ability to steal the database in the first place.
Assuming an attacker gains access to a database dump, the point he is trying
to make is that it is much harder for an attacker to steal 2 TB of data than
200 MB of data.

Regardless, that doesn't mean the technique is sound and safe to use. He makes
the disastrous assumption that hash functions work just like PRNGs, but they
don't. The guarantee for hash functions is that given two messages x and y,
P(H(x) = H(y) | x != y) is extremely close to zero. In English, the chance of
two given distinct messages having the same hash is very small, which is why
it works for password authentication. The case he is making is not the
comparison of two hashes, but the existence of a hash in a set. In other
words, given a set of existing messages S and a new message x, what is
P(H(x) in H(S) | x not in S)? If hash functions were like PRNGs, then this
would be a simple calculation (the answer to which is roughly
|S|/sizeof(H(x)), i.e. |S|/2^n for an n-bit hash, which provides reasonably
low probabilities for hash functions at least 64 bits in length). But the
distribution of hash functions is not exactly random and is a lot more
complicated than that. I would wait for researchers to analyze this scenario
and calculate that probability with accuracy before ever using such a system.
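
To put numbers on the "simple calculation" above, here is a rough Python
sketch of what the PRNG-like model would predict. It assumes every hash value
is independent and uniform over 2^n outputs, which is exactly the assumption I
am questioning, so treat the results as the idealized figure and not as a
property of any real hash function. The function names are made up for
illustration.

```python
import math


def membership_probability_uniform(set_size: int, hash_bits: int) -> float:
    """P(H(x) in H(S)) under the idealized model: each of the set_size
    stored hashes is an independent uniform draw from 2**hash_bits values.

    Computes 1 - (1 - 2**-hash_bits) ** set_size using log1p/expm1 so that
    very small probabilities are not lost to floating-point rounding.
    """
    per_value = 2.0 ** -hash_bits
    return -math.expm1(set_size * math.log1p(-per_value))


def membership_probability_approx(set_size: int, hash_bits: int) -> float:
    """The rough estimate |S| / 2**n, valid while |S| is much less than 2**n."""
    return set_size / 2.0 ** hash_bits


if __name__ == "__main__":
    # Hypothetical set size of one billion stored hashes.
    stored = 10 ** 9
    for bits in (32, 64, 128):
        exact = membership_probability_uniform(stored, bits)
        approx = membership_probability_approx(stored, bits)
        print(f"{bits:3d}-bit hash: model={exact:.3e}  |S|/2^n={approx:.3e}")
```

The two functions agree closely once |S| is far below 2^n, which is the case
for 64 bits and up. Whether a real hash function actually matches this model
for set membership is the open question.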

An alternative, and much simpler, solution would be to use scrypt. It's both
CPU and memory intensive, so it offers the same protection against brute force
attacks as the method proposed in the article.
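
For reference, a minimal sketch of scrypt-based password hashing, using
Python's standard hashlib.scrypt (which needs an OpenSSL build with scrypt
support). The cost parameters, salt length, and helper names here are
illustrative assumptions, not a recommendation for MediaWiki and not anything
taken from the article.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

# Assumed cost parameters: n is the CPU/memory cost, r the block size,
# p the parallelism. Memory use is roughly 128 * n * r bytes (16 MiB here).
SCRYPT_N = 2 ** 14
SCRYPT_R = 8
SCRYPT_P = 1
SCRYPT_MAXMEM = 64 * 1024 * 1024  # upper bound passed to OpenSSL
DIGEST_LEN = 32
SALT_LEN = 16


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest); a fresh random salt is generated if none is given."""
    if salt is None:
        salt = os.urandom(SALT_LEN)
    digest = hashlib.scrypt(
        password.encode("utf-8"),
        salt=salt,
        n=SCRYPT_N,
        r=SCRYPT_R,
        p=SCRYPT_P,
        maxmem=SCRYPT_MAXMEM,
        dklen=DIGEST_LEN,
    )
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)
```

Only the salt and digest need to be stored per user, so there is no multi-TB
table to maintain. The memory cost is paid by whoever computes each guess.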

