On Jun 17, 2007, at 9:55 PM, Thomas Conway wrote:
Hi All, I'm trying to figure out how to maximum performance out of one of my inner loops which involves string hashing.
... mix :: Triple -> TripleThis looks like a version of the Bob Jenkins hash function from burtleburtle.net. I implemented the 32-bit version of this as follows:
mix :: Int32 -> Int32 -> Int32 -> (Int32 -> Int32 -> Int32 -> a) -> a mix a0 b0 c0 k0 = let mixR k a b c = (a-b-c) `xor` (c `shiftR` k) mixL k b c a = (b-c-a) `xor` (a `shiftL` k) mix3 k1 k2 k3 k a b c = let a' = mixR k1 a b c b' = mixL k2 b c a' c' = mixR k3 c a' b' in k a' b' c' in (mix3 13 8 13 $ mix3 12 16 5 $ mix3 3 10 15 $ k0) a0 b0 c0I mention this because your code writes the whole thing out longhand---which might be faster, or might not, but certainly misses the highest-level structural patterns in the original. Your use of a data type to represent triples is probably better nowadays than my rather quirky use of CPS (in other words, this could have been a function Triple -> Triple instead of the rather odd type you see above).
That said, I assume you instrumented your code and determined that hash collisions are actually a bottleneck for you, and that a hash table is the right structure to begin with? I fell back on much- simpler multiplicative hashing schemes for Data.HashTable. A multiply is much faster than vast amounts of bit-fiddling---but of course its collision behavior isn't nearly as good and this can be a problem with large data sets. And note that the multiplicative hashing currently used in Data.HashTable doesn't require prime table sizes; in fact we use powers of two and table doubling. When last I checked the result was faster than Data.Map, but not by much, and using strings probably wipes out that advantage vs. tries.
-Jan-Willem Maessen
smime.p7s
Description: S/MIME cryptographic signature
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe