I've done similar testing (looking for collisions within SHA-1) with millions of strings and their hashes. I didn't actually expect to find any collisions, but I wanted to try it anyway. In the process I realized that Ruby's Hash wouldn't work for this project because of memory limitations (if I recall correctly), and I had to find another way.
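For anyone who wants to try something similar, here is a minimal sketch of one way to keep memory down. It is not necessarily the approach I ended up with; it only illustrates the idea of storing the raw 20-byte digests in a Set instead of keeping hex strings and their inputs in a Hash. The helper name `first_collision` and the `bytes:` truncation parameter are made up for this example (truncation only exists so a collision can be forced in a small demo).

```ruby
require 'digest'
require 'set'

# Hypothetical helper (not from the original project).
# Returns the first input whose SHA-1 digest was already seen, or nil if
# no two inputs share a digest. Assumes the inputs themselves are distinct;
# a repeated input would be reported as a "collision".
#
# bytes: how many leading bytes of the 20-byte digest to compare.
#        Anything below 20 is for demonstration only.
def first_collision(inputs, bytes: 20)
  seen = Set.new
  inputs.each do |s|
    # Raw binary digest: 20 bytes instead of the 40-character hex form.
    d = Digest::SHA1.digest(s)[0, bytes]
    # Set#add? returns nil when the element is already present.
    return s unless seen.add?(d)
  end
  nil
end
```

Something like `first_collision((1..1_000_000).lazy.map(&:to_s))` walks the inputs one at a time, so only the digests are held in memory. For data sets too large even for that, writing the digests to a file and sorting it externally would be another option.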
When it was clear there were no collisions, I tested my generated hashes for some simple patterns, e.g. which char (0, 1, 2, ... d, e, f) is most likely to appear in position 1 of the hash, position 2 of the hash, etc. I did find that some chars were more likely to occur at a given position, so I dusted off my stats knowledge to determine whether the results were statistically significant. It turns out that none of my findings were.

Though I didn't discover anything new, I'm glad I did it. This exercise helped me become a little bit better. It reminds me of being in school and writing a chess program: the purpose of the exercise isn't to discover anything groundbreaking, but rather to improve my own skill set.

I have enjoyed the discussion in this thread and the variety of ways to approach the problem.

Thank you,
Nicholas Stewart