Dave Sill <[EMAIL PROTECTED]> wrote: > > The first thing I did was Google for "hash prime modulo even > distribution". That turns up many repetitions of Charles' assertion, > without proof or explanation. [...] > Being a ``Profile, don't speculate'' kind of guy, I decided to write a > little program to test modulo hashes, which is attached to this > message for your entertainment. > > The result is that I can't see any effect of primality of the hash > table size on distribution. Funny, I can reproduce it easily. Sure your random numbers are random? With the attached Python script (it depends on the popular stats.py and pstat.py modules), analyzing 250000 15-bit random integers (read from a text file, one per line), I get the following: [charlesc@charon personal]$ ./buckets.py 12 13 A: 12 buckets count: 250000 mean: 20833.3333 std. dev.: 196.3153 B: 13 buckets count: 250000 mean: 19230.7692 std. dev.: 103.8646 The effect does appear to diminish significantly for large values, though. Charles -- ----------------------------------------------------------------------- Charles Cazabon <[EMAIL PROTECTED]> GPL'ed software available at: http://www.qcc.sk.ca/~charlesc/software/ -----------------------------------------------------------------------
#!/usr/bin/python import whrandom import string import stats import sys ints = map (int, map (string.strip, open ('randints.txt').readlines ())) ####################################### def fill_buckets (numbuckets): buckets = [0] * numbuckets for i in ints: x = i % numbuckets buckets[x] = buckets[x] + 1 return buckets ####################################### def main (a, b): A = fill_buckets (a) B = fill_buckets (b) for L, name in ((A, 'A'), (B, 'B')): sum = reduce (lambda a, b: a+b, L) mean = stats.mean (L) dev = stats.stdev (L) print '%s: %i buckets' % (name, len (L)) print ' count: %10i' % sum print ' mean: %7.04f' % mean print ' std. dev.: %7.04f' % dev print ####################################### if __name__ == '__main__': a = int (sys.argv[1]) b = int (sys.argv[2]) main (a, b)