Hello,

DC>For example, file system caches fill their directories roughly equally
DC>when their paths are created from MD5 hashed keys. Doing something
DC>simple and unique like "URL-encoding" the key to make a legal identifier
DC>(legal in the sense that it is a valid filename) wouldn't distribute as
DC>evenly.

Let me point out that if you are using MD5 hashes for directory spreading
(i.e. to spread a large number of files across a tree of directories so
that no one directory is filled with too many files for efficient
filesystem access), the easy solution is to just append the unique key
that you are hashing to the files involved.

For example, say your keys are e-mail addresses and you just want to use
an MD5 hash to spread your data files over directories so that no one
directory has too many files in it. Say your original key is
[EMAIL PROTECTED] (hex encoded MD5 hash of this is RfbmPiuRLyPGGt3oHBagt).
Instead of just storing the key in the file
R/Rf/Rfb/Rfbm/RfbmPiuRLyPGGt3oHBagt.dat, store the key in the file
[EMAIL PROTECTED] Presto... collisions are impossible.

Humbly,

Andrew

----------------------------------------------------------------------
Andrew Ho               http://www.tellme.com/       [EMAIL PROTECTED]
Engineer                   [EMAIL PROTECTED]          Voice 650-930-9062
Tellme Networks, Inc.       1-800-555-TELL            Fax 650-930-9101
----------------------------------------------------------------------



Reply via email to