On Jun 22, 9:10 am, Ketil Malde <ke...@malde.org> wrote: > Kamil Dworakowski <ka...@dworakowski.name> writes: > > Right... Python uses hashtables while here I have a tree with log n > > access time. I did not want to use the Data.HashTable, it would > > pervade my program with IO. The alternative is an ideal hashmap that never > > gets changed. This program creates a dictionary at start which then is only > > being used to read from: an ideal application for the Data.PerfectHash > > by Mark Wotton available on Hackage [3]. > > If you are considering alternative data structures, you might want to > look at tries or Bloom filters, both have O(n) lookup, both have > Haskell implementations. The latter is probably faster but > probabilistic (i.e. it will occasionally fail to detect a > misspelling - which you can of course check against a "real" > dictionary).
Using Bryan O'Sullivan's fantastic BloomFilter I got it down below Python's run time! Now it is 35.56s, 28% of the time is spent on GC, which I think means there is still some room for improvement. Do you say that PerfectHash runs with a penalty of calling into c library? _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe