https://bugzilla.wikimedia.org/show_bug.cgi?id=28419
--- Comment #51 from Tyler Romeo <tylerro...@gmail.com> 2012-07-10 12:45:31 UTC ---

He mentions in the article that the index is rather large (larger than the actual data). It's possible his data is inaccurate or unrepresentative, but considering the rather random nature of hash functions I doubt that is the case. His technique will require a large amount of RAM.

However, the other thing he mentions is the attacker's ability to steal the database in the first place. Assuming an attacker gains access to a database dump, the point he is trying to make is that it is much harder for an attacker to steal 2 TB of data than 200 MB of data.

Regardless, that doesn't mean the technique is sound and safe to use. He makes the disastrous assumption that hash functions work just like PRNGs, but they don't. The guarantee for hash functions is that, given two messages x and y, P(x != y => H(x) = H(y)) is extremely close to zero. In English, the chance of two given, different messages having the same hash is very small, which is why it works for password authentication.

The case he is making is not about the comparison of two hashes, but about the existence of a hash in a set. In other words, given a set of existing messages S and a new message x, what is P(x not in S => H(x) in H(S))? If hash functions were like PRNGs, then this would be a simple calculation (the answer to which is roughly |S|/sizeof(H(x)), that is, |S| divided by the number of possible hash values, which gives reasonably low probabilities for hash functions at least 64 bits in length). But the distribution of hash functions is not exactly random and is a lot more complicated than that. I would wait for researchers to analyze this scenario and calculate that probability accurately before ever using such a system.

An alternative, and much simpler, solution would be to use scrypt. It's both CPU and memory intensive, so it protects against brute-force attacks with the same level of protection as the method proposed in the article.
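To put rough numbers on the |S|/sizeof(H(x)) estimate mentioned above, here is a minimal Python sketch. It assumes the idealized model in which hash outputs are uniformly random, which is exactly the assumption questioned in this comment, so the result is only a ballpark figure and not a statement about any real hash function. The function name `membership_collision_bound` and the example set size are made up for illustration.

```python
from fractions import Fraction


def membership_collision_bound(set_size: int, hash_bits: int) -> Fraction:
    """Union bound on P(H(x) in H(S)) for a message x not in S.

    Assumes (hypothetically) that H behaves like a uniform random
    function onto hash_bits-bit outputs. Real hash functions are not
    guaranteed to behave this way.
    """
    if set_size < 0 or hash_bits <= 0:
        raise ValueError("set_size must be >= 0 and hash_bits must be > 0")
    # Each of the |S| stored hashes matches H(x) with probability 1/2^n,
    # so the chance that any of them matches is at most |S| / 2^n.
    return min(Fraction(1), Fraction(set_size, 2 ** hash_bits))


if __name__ == "__main__":
    # Illustrative numbers only: one billion stored hashes.
    for bits in (32, 64, 128):
        bound = membership_collision_bound(10**9, bits)
        print(f"{bits}-bit hash: bound ~ {float(bound):.3e}")
```

Under this idealized model, a 32-bit hash is saturated by a billion entries (the bound is capped at 1), while 64 bits and above give small probabilities. Whether real hash functions come close to these figures is the open question raised above.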