On Mon, Oct 07, 2013 at 12:41:58AM +0200, Tomas Vondra wrote: > > 2. Consider using a simpler/faster hash function, like FNV[1] or Jenkins[2]. > > For fun, try not hashing those ints at all and see how that performs > > (that, > > I think, is what you get from HashSet<int> in Java/C#). > > I've used crc32 mostly because it's easily available in the code (i.e. > I'm lazy), but I've done some quick research / primitive benchmarking > too. For example hashing 2e9 integers takes this much time: > > FNV-1 = 11.9 > FNV-1a = 11.9 > jenkins = 38.8 > crc32 = 32.0 > > So it's not really "slow" and the uniformity seems to be rather good. > > I'll try FNV in the future, however I don't think that's the main issue > right now. > Hi Tomas,
If you are going to use a function that is not currently in the code, please consider xxhash: http://code.google.com/p/xxhash/ Here are some benchmarks for some of the faster hash functions: Name Speed Q.Score Author xxHash 5.4 GB/s 10 MumurHash 3a 2.7 GB/s 10 Austin Appleby SpookyHash 2.0 GB/s 10 Bob Jenkins SBox 1.4 GB/s 9 Bret Mulvey Lookup3 1.2 GB/s 9 Bob Jenkins CityHash64 1.05 GB/s 10 Pike & Alakuijala FNV 0.55 GB/s 5 Fowler, Noll, Vo CRC32 0.43 GB/s 9 MD5-32 0.33 GB/s 10 Ronald L. Rivest SHA1-32 0.28 GB/s 10 Regards, Ken -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers