[BTW, should we move to Haskell-Cafe?]

> Because updates are not so infrequent that I want to pay the cost of
> replicating the entire array every update (or every ten!). I'm
> willing to exchange *some* read time for faster update. Also, because
> small array copies may be sufficiently faster than tree traversals
> that I may pay very little extra for faster reads.
>
> FYI, my current code looks like this:
I'm afraid I'm somewhat confused. The hash-table-related code makes a
copy of the whole array once every purgatory-size operations. Thus, given
a sequence of n unique inserts (which don't trigger the major_rebuild),
at most 2*n/|purgatory| elements will be moved. As I understand your
code, in particular,

> insert (ArrMap proto toBase ht) key elt = ArrMap proto toBase newHT
>     where newHT = insert' proto ht (toBase key) elt
> insert' _ (HT x _) [] = HT x
> insert' proto (HT Nothing y) key = insert' proto (HT (Just proto) y) key
> insert' p (HT (Just ar) y) (k:ey) = \val -> HT (Just $ newArray val) y
>     where newArray val = ar//[(k, insert' p (ar!k) ey val)]

you make a copy of an array of size |base| once per character of the
key, i.e., |key| times. _If_ the tree is kept balanced and filled, then
each insert in a sequence of n inserts copies (log n)/(log |base|) * |base|
elements. For small n and large |base|, that can be a lot. For example,

> testMap = newMap (chr 0) (chr 255) id
> main = do print $ lookup (insert testMap "abc" (Just "def")) "abc"

involves copying a 256-element array three times. Right?

I guess we have come to the point where we really need to know the
distribution of reads and writes, the length of the key (and whether it
is bounded), and the distribution of key values. We must also be sure
of the cost basis. So far, we have concentrated only on the traversal
through and the moving of elements as a function of the size of the map.
That is clearly not sufficient, as Andrew Bromage pointed out.

_______________________________________________
Haskell mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell
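[Editorial note: a minimal sketch of the copy-cost arithmetic discussed
above. The function name `elementsCopiedForKey` is invented for
illustration; it is not part of the quoted code. It assumes, as the
message argues, that inserting one key rebuilds one |base|-sized array
per character of the key, since (//) on an immutable Data.Array copies
the whole array.]

```haskell
-- Count how many array elements one insert copies in a |base|-ary
-- trie keyed by strings: one full |base|-element array copy per
-- character of the key.  (Name invented for illustration.)
elementsCopiedForKey :: Int -> String -> Int
elementsCopiedForKey base key = base * length key

-- The example from the message: base = 256 (chr 0 .. chr 255),
-- key = "abc", so three 256-element copies.
main :: IO ()
main = print (elementsCopiedForKey 256 "abc")  -- prints 768
```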