Something like that should definitely be included in Data.List. Thanks for working on it.
Roman * Niklas Hambüchen <m...@nh2.me> [2013-07-14 19:20:52+0800] > tldr: nub is abnormally slow, we shouldn't use it, but we do. > > > As you might know, Data.List.nub is O(n²). (*) > > As you might not know, almost *all* practical Haskell projects use it, > and that in places where an Ord instance is given, e.g. happy, Xmonad, > ghc-mod, Agda, darcs, QuickCheck, yesod, shake, Cabal, haddock, and 600 > more (see https://github.com/nh2/haskell-ordnub). > > I've taken the Ord-based O(n * log n) implementation from yi using a Set: > > ordNub :: (Ord a) => [a] -> [a] > ordNub l = go empty l > where > go _ [] = [] > go s (x:xs) = if x `member` s then go s xs > else x : go (insert x s) xs > > > and put benchmarks on > http://htmlpreview.github.io/?https://github.com/nh2/haskell-ordnub/blob/1f0a2c94a/report.html > (compare `nub` vs `ordNub`). > > `ordNub` is not only in a different complexity class, but even seems to > perform better than nub for very small numbers of actually different > list elements (that's the numbers before the benchmark names). > > (The benchmark also shows some other potential problem: Using a state > monad to keep the set instead of a function argument can be up to 20 > times slower. Should that happen?) > > What do you think about ordNub? > > I've seen a proposal from 5 years ago about adding a *sort*Nub function > started by Neil, but it just died. > > > (*) The mentioned complexity is for the (very common) worst case, in > which the number of different elements in the list grows with the list > (alias you don't have an N element list with always only 5 different > things inside). > > _______________________________________________ > Haskell-Cafe mailing list > Haskell-Cafe@haskell.org > http://www.haskell.org/mailman/listinfo/haskell-cafe _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe