On Jan 4, 2006, at 5:30 AM, Simon Marlow wrote:

On 30 December 2005 01:23, Jan-Willem Maessen wrote:

Probably.  The minimum table chunk size was rather large.  I have
been experimenting (tests are running even as I type) with alternate
implementations of Data.HashTable.  So far the winning implementation
is one based on multiplicative hashing and simple table doubling.
This seems to be consistently competitive with / faster than
Data.Map.  At this point my inclination is to make the whole suite
available:

Data.HashTable.Class
Data.HashTable.DataMap
Data.HashTable.Doubling
Data.HashTable.Flat
Data.HashTable.Freeze
Data.HashTable.Modulus
Data.HashTable.Multiplicative
Data.HashTable.Original
Data.HashTable.Test
Data.HashTable

I've separated out the hashing technique (Modulus, Multiplicative)
from the hash table implementation.  Note that a few of the above are
bogus, and this is only a fraction of the implementations tried thus
far.

I need to get distribution clearance for a bunch of this code from my
employer, and figure out how to package it.  The latter may be
tricky, as Data.Hashtable is currently rather intimately tied to a
bunch of rather tricky bits of GHC.

I wonder if you could put together a drop-in replacement for
Data.HashTable that we can incorporate?  There's not much point in us
still providing the inefficient version as standard after you've done
all this work to figure out how to do it better.

That'd be good---though what qualifies as the "efficient version" will, I think, change based on the GC changes you said you made (I haven't had the time/patience to try to bootstrap the development version of GHC). This is one of the reasons I've been saving so many bread crumbs along the way. All of the above will work as drop-in Data.HashTable replacements, with Data.HashTable simply re-exporting multiplicative hashing and table doubling.

The tricky bit, as you probably know, is the rather intimate ties between Data.HashTable and the rest of the builtin code. This requires the shipping Data.HashTable library to import most things from the source, rather than from the place where they're made publicly available. Do you have any suggestions as to how I might go about testing that integration?

-Jan


Cheers,
        Simon

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Reply via email to