On Fri, 04 Feb 2005 11:12:50 -0800 (PST), Ted Zlatanov
<[EMAIL PROTECTED]> wrote:
> > Mainly common letters (like vowels) fill much faster than, say, w, x, z.
> > I am probably going to implement a tracked integer id per email
> > address and fill the hash in reverse order. Since the last digit of
> > the id increments it will naturally spread across the hash. The only
> > factor that will impact the balance is deletion at that point.
> 
> > This has a lot of drawbacks because of the need to track the
> > maildirId, and manage the increment of it. Does anyone else have any
> > other methods they use, and would share?
> 
> Why not just do MD5 hashing of the name and be done with it?  I would
> expect any home-grown hashing scheme to be less capable.  You also
> don't need to track the currently allocated ID.  Just make sure you
> can handle more than one user per MD5 hash, but your distribution will
> be very close to ideal because MD5 collisions are so unlikely.  Then
> just break things up by the first N characters of the hash:

I had considered using an MD5/SHA hash at one point since it would
easily mitigate the vowel issue. The only problem I could see is that
there is no guarantee regarding balance. I can mitigate this by
making the hash very deep.
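
As a rough sketch of what a deeper hash could look like, each directory
level can take one hex character of the MD5 digest. The function name
maildir_path, the base path /var/mail/hashed, and the depth parameter
are all hypothetical here, not taken from any existing setup, and
lowercasing the address first is an assumption as well:

```python
import hashlib
import posixpath

def maildir_path(address, depth=2, base="/var/mail/hashed"):
    """Map an email address to a nested directory, one hex digit per level."""
    # Lowercasing is an assumption; drop it if addresses are case-sensitive.
    digest = hashlib.md5(address.lower().encode("utf-8")).hexdigest()
    # The first `depth` hex characters become the nested directory levels.
    levels = list(digest[:depth])
    return posixpath.join(base, *levels, address)
```

With depth=2 this gives 16^2 = 256 leaf directories, and depth=3 gives
4096, so the depth can be chosen to match the expected number of
mailboxes.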

Both systems seem good. I think I will need to run some simulations
to see how balanced the MD5 system would be with my current set of
customer data.
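
A minimal sketch of such a simulation, assuming the addresses are
available as a plain list of strings. The helper names bucket_counts
and balance_summary are made up for illustration, and the synthetic
addresses at the bottom are only a stand-in for the real customer data:

```python
import hashlib
from collections import Counter

def bucket_counts(addresses, prefix_len=2):
    """Count how many addresses fall into each MD5-prefix bucket."""
    counts = Counter()
    for addr in addresses:
        digest = hashlib.md5(addr.lower().encode("utf-8")).hexdigest()
        counts[digest[:prefix_len]] += 1
    return counts

def balance_summary(counts, prefix_len=2):
    """Return (smallest, largest, mean) bucket size, counting empty buckets."""
    total_buckets = 16 ** prefix_len
    # Buckets that received no address are missing from counts; pad with zeros.
    sizes = list(counts.values()) + [0] * (total_buckets - len(counts))
    return min(sizes), max(sizes), sum(sizes) / total_buckets

if __name__ == "__main__":
    # Synthetic stand-in addresses; replace with the real customer list.
    sample = [f"user{i}@example.com" for i in range(100_000)]
    lo, hi, mean = balance_summary(bucket_counts(sample))
    print(f"min={lo} max={hi} mean={mean:.1f}")
```

Comparing the smallest and largest bucket against the mean for a few
values of prefix_len should show how much imbalance to expect at each
depth.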

Your input is appreciated.
-- 
Sean
