On Fri, 04 Feb 2005 11:12:50 -0800 (PST), Ted Zlatanov <[EMAIL PROTECTED]> wrote
> > Mainly common letters (like vowels) fill much faster than, say, w, x, z.
> > I am probably going to implement a tracked integer id per email
> > address and fill the hash in reverse order. Since the last digit of
> > the id increments it will naturally spread across the hash. The only
> > factor that will impact the balance is deletion at that point.
>
> > This has a lot of drawbacks because of the need to track the
> > maildirId, and manage the increment of it. Does anyone else have any
> > other methods they use, and would share?
>
> Why not just do MD5 hashing of the name and be done with it? I would
> expect any home-grown hashing scheme to be less capable. You also
> don't need to track the currently allocated ID. Just make sure you
> can handle more than one user per MD5 hash, but your distribution will
> be very close to ideal because MD5 collisions are so unlikely. Then
> just break things up by the first N characters of the hash:

I had considered using an MD5/SHA hash at one point, since it would
easily mitigate the vowel issue. The only problem I can see is that
there is no guarantee regarding balance. I can mitigate this by making
the hash very deep. Both systems seem good; I think I will need to run
some simulations to see how balanced the MD5 system would be with my
current set of customer data.

Your input is appreciated.

-- 
Sean
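
A minimal sketch of the kind of simulation mentioned above, assuming the bucket is taken from the first N hex characters of the MD5 digest of the lowercased address. The function names, the lowercasing step, and the generated sample addresses are all illustrative assumptions, not part of either proposal; a real run would feed in the actual customer list.

```python
import hashlib
from collections import Counter


def bucket_for(address, depth=2):
    """Return the first `depth` hex characters of the MD5 of the address.

    Lowercasing and stripping are assumptions made so that the same
    mailbox always lands in the same bucket.
    """
    normalized = address.strip().lower().encode("utf-8")
    return hashlib.md5(normalized).hexdigest()[:depth]


def bucket_counts(addresses, depth=2):
    """Count how many addresses fall into each bucket."""
    return Counter(bucket_for(a, depth) for a in addresses)


def balance_report(counts, depth=2):
    """Summarize how evenly the addresses are spread over 16**depth buckets."""
    total_buckets = 16 ** depth
    total = sum(counts.values())
    return {
        "buckets": total_buckets,
        "mean": total / total_buckets,
        "largest": max(counts.values(), default=0),
        "empty": total_buckets - len(counts),
    }


if __name__ == "__main__":
    # Hypothetical sample data; replace with the real address list.
    sample = ["user%d@example.com" % i for i in range(100000)]
    print(balance_report(bucket_counts(sample, depth=2), depth=2))
```

Comparing "largest" against "mean" for a few values of depth should show how deep the hash can go before buckets become too sparse to be worth the extra directories.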
