Oleg, Here is a little more information on the use of CRC32 as a hash function, with some warning caveats:
http://home.comcast.net/~bretm/hash/8.html Regards, Ken On Tue, Nov 04, 2008 at 03:15:44PM -0600, Kenneth Marshall wrote: > On Tue, Nov 04, 2008 at 11:32:47PM +0300, Oleg Bartunov wrote: > > Just interested if you repeat your tests not with cracklib-dict, > > but using 8-bit words. From our experience we found many hash functions > > are optimized for 7-bit words and produce too many collisions > > for 8-bit words. That's why we use crc32. > > > > Oleg > > I think that the lines: > > uint32 from 1-1648379 309 319 347 > (uint32 1-1948379)*256 309 314 304 > (uint32 1-1948379)*16 310 314 324 > "a"uint32 (i.e. a00001,a0002...) 320 321 312 > > uint32uint32 (i.e. uint64) 321 287 309 > > can count as 8-bit words if taken a byte at a time. In fact > that is how hash_any() treats them, as a character string > and a length. One of the design goals of the original 1997 > hash function in lookup2 and the 2006 update in lookup3 is > to support keys of arbitrary arrangements of bits. I can run > any additional checks that you want since the test harness > is perl with Inline::C. If you are using crc32 his article in > Dr. Dobbs shows that CRC has a "2 into 1" funnel-15 and an > "11 into 10" funnel-100 unless you are using a generalized > CRC. Also, unless you can inline your CRC the Jenkins lookup3 > is 5n+20 where CRC is 9n+3. > > Regards, > Ken -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers