I've done similar testing (looking for collisions within SHA-1) with
millions of strings and their hashes.  I didn't actually expect to
find any collisions, but I wanted to try it anyway.  In the process I
realized that Ruby's Hash wouldn't work for this project because of
memory limitations (if I recall correctly), and I had to find another
way.

Once it was clear there were no collisions, I tested my generated
hashes for some simple patterns, e.g. which hex character
(0, 1, 2 ... d, e, f) is most likely to appear in position 1 of the
hash, position 2 of the hash, etc.  I did find that some characters
were more likely to occur at a given position.
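
For anyone who wants to try the per-position tally, here is a minimal
sketch of how it could be done.  This is not the code I ran back then;
the method names (position_counts, most_common_at) and the
"string-N" sample inputs are made up for illustration.  The only
library call is Digest::SHA1.hexdigest from Ruby's standard library.

```ruby
require 'digest'

# Tally how often each hex character appears at each of the 40
# positions of a SHA-1 hex digest.
# Returns an array of 40 hashes: counts[pos][char] => occurrences.
# (position_counts is a hypothetical helper name.)
def position_counts(strings)
  counts = Array.new(40) { Hash.new(0) }
  strings.each do |s|
    Digest::SHA1.hexdigest(s).each_char.with_index do |ch, pos|
      counts[pos][ch] += 1
    end
  end
  counts
end

# Return [char, count] for the most frequent character at a
# zero-based position.  (Hypothetical helper name.)
def most_common_at(counts, pos)
  counts[pos].max_by { |_ch, n| n }
end

# Illustrative input only: 10,000 generated strings.
sample = (1..10_000).map { |i| "string-#{i}" }
counts = position_counts(sample)
ch, n = most_common_at(counts, 0)
puts "position 1: '#{ch}' appears #{n} times out of #{sample.size}"
```

Keeping only 40 x 16 counters means memory stays flat no matter how
many strings are fed in, which sidesteps the kind of memory trouble I
ran into when storing every digest.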

I dusted off my stats knowledge to determine whether the results were
statistically significant.  It turns out that none of my findings
were.
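
I don't remember exactly which test I used, so take the following as
one reasonable way to check, not as what I actually did: a Pearson
chi-square goodness-of-fit test against a uniform distribution over
the 16 hex characters.  The method name chi_square_uniform and the
sample strings are made up for illustration; 24.996 is the standard
chi-square critical value for 15 degrees of freedom at the 0.05
level.

```ruby
require 'digest'

# Pearson chi-square statistic for observed counts against the
# assumption that every category is equally likely.
# (chi_square_uniform is a hypothetical helper name.)
def chi_square_uniform(observed)
  total = observed.sum
  expected = total.to_f / observed.size
  observed.sum { |o| (o - expected)**2 / expected }
end

# Critical value for 16 categories (15 degrees of freedom), alpha = 0.05.
CRITICAL_DF15_P05 = 24.996

HEX_CHARS = '0123456789abcdef'.chars.freeze

# Illustrative input only: count the first hex character of the
# SHA-1 digest of 10,000 generated strings.
tally = Hash.new(0)
(1..10_000).each { |i| tally[Digest::SHA1.hexdigest("string-#{i}")[0]] += 1 }
observed = HEX_CHARS.map { |c| tally[c] }

stat = chi_square_uniform(observed)
verdict = stat > CRITICAL_DF15_P05 ? 'significant' : 'not significant'
puts "chi-square = #{stat.round(3)} (#{verdict} at the 0.05 level)"
```

One caveat: testing all 40 positions at the 0.05 level means a couple
of them can be expected to cross the threshold by chance alone, so a
stricter per-position threshold (e.g. a Bonferroni correction) would
be needed before calling any single position unusual.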

Though I didn't discover anything new, I'm glad I did it.  The
exercise made me a little bit better at this kind of work.  It reminds
me of being in school and writing a chess program: the purpose of the
exercise isn't to discover anything groundbreaking but to improve my
own skill set.

I have enjoyed the discussion in this thread and the variety of ways
to approach the problem.

Thank you,
Nicholas Stewart

/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/