Hello Jules, Wednesday, August 27, 2008, 7:59:55 PM, you wrote:
>>> To get "much better efficient" than a trie, the hash function has to be >>> so fast that it is faster than following (log n) pointers >> >> afaiu, trie search follows n pointers > No. > "n" is the number of strings in my data set (dictionary). > If I have "n" strings the average string length is asymptotically, in > some sense, "log n". Of course for particular data sets it's may be more . > But "log n" is the length of the shortest unique coding, it's also the > number of characters you typically need to traverse before you have > reached a unique prefix, at which point your trie can short-circuit. > I appreciate that I didn't define my terminology ;) You might have a > different "n". > I repeat my challenge "Prove it". I will be interested to see how much a > good immutable hash outperforms Data.Map. > I would then also be interested to see how much it outperforms a decent > Data.Map (such as your own AVL one) and a decent trie. 1) i don't have time/interest to do it, so i just pushed the idea 2) what you have proved there is that for set of n randomly chosen strings trie lookup need O(log n) time - the same is true for hash 3) practical performance of trie will suffer by the need to follow several trie levels each providing several elements to check. OTOH hash lookup will usually need only 1-2 checks, including one check on full string size 4) using ByteString for hash indexes would make lookups much faster -- Best regards, Bulat mailto:[EMAIL PROTECTED] _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe