On 12/30/2011 05:44 PM, Gerd Stolpmann wrote: > 1) Avoid hash tables in contexts where security is relevant. The > alternative is Set (actually a balanced binary tree), which does not > show this problem.
Highly recommended. Nothing beats guaranteed O(log n) operations. > 2) Use cryptographically secure hash functions. Hopeless: with a hash size of 30 bits, as in Caml, or even 64 bits, there are no cryptographically secure hash functions. > 3) Use "randomized" hash tables. The trick here is that there is not a > single hash function h anymore, but a family h(1)...h(n). When the hash > table is created, one of the functions is picked randomly. This makes it > impossible to craft an attack request, because you cannot predict the > function. Indeed. The optional "seed" parameter to Hashtbl.create does exactly this in the new implementation of Hashtbl (the one based on Murmur3). > So, the question is how to do 3). I see two problems here: > > a) how to define the family of hash functions. Is it e.g. sufficient to > introduce an initialization vector for the Murmurhash algorithm, and > fill it randomly? IIRC, the Web pages for the Murmur family of hashes gives some statistical evidence that this approach works. > How to get a random number that is good enough? Hmm. /dev/random is your friend on the platforms that support it. Otherwise, there's always the Random module, but Random.self_init isn't very strong. - Xavier Leroy -- Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs