Re: [Haskell-cafe] (flawed?) benchmark : sort
I get it now, thanks. Also, I guess it is possible to find a better algorithm for standard library sort. Christopher Skrzętnicki On Wed, Mar 5, 2008 at 12:04 AM, Chaddaï Fouché <[EMAIL PROTECTED]> wrote: > 2008/3/4, Krzysztof Skrzętnicki <[EMAIL PROTECTED]>: > > Thanks for improved code. My point was to measure which programming > patterns > > are faster than the others so I can learn which ones I should use. > However, > > the thing that is really bad is the fact, that even oneliner qsort_i is > > faster than library sort. Which is very different from what I've > expected. > > My intuition is only best and fastest code goes to library, to the point > > that people can learn from it. It seems I was mislead. > > I think you did not correctly got the point of my and Neil Mitchell's > message : you benchmarked those function on a completely random > sequences so qsort was at his best, but in the real world, most > sequences would have bias, and it is not infrequent at all to sort a > partially sorted (or reverse sorted) list... In this case the > performance of all your qsort are abysmal... Which is the reason the > old sort was replaced by the actual mergesort in the library. Try my > code (with 1 elements for example), you'll see that sort is the > best on a sorted list, and that qsort is at least 60 times slower (on > 1, in fact it is degenerating in O(n^2)). > > In the real world, the library maintainers decided it was ok to pay a > slight overhead in the case where the list to sort is really randomly > distributed since mergesort won so hugely over qsort in the pretty > frequent case (in programs) of lists which present regularities. > > There is no sort which is ideal in all situations, but we can try to > get a sort that works well in all situations, and don't trash in > situations not so infrequent. > > (On the other hand, don't expect libraries functions to always be the > best to use in your particular situation, they tend to be all-purpose > as we just saw and the maintainers prefer simple generic > implementations rather than complicated ones who could be slightly (or > even significantly) faster in some case) > > -- > Jedaï > ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] (flawed?) benchmark : sort
2008/3/4, Krzysztof Skrzętnicki <[EMAIL PROTECTED]>: > Thanks for improved code. My point was to measure which programming patterns > are faster than the others so I can learn which ones I should use. However, > the thing that is really bad is the fact, that even oneliner qsort_i is > faster than library sort. Which is very different from what I've expected. > My intuition is only best and fastest code goes to library, to the point > that people can learn from it. It seems I was mislead. I think you did not correctly got the point of my and Neil Mitchell's message : you benchmarked those function on a completely random sequences so qsort was at his best, but in the real world, most sequences would have bias, and it is not infrequent at all to sort a partially sorted (or reverse sorted) list... In this case the performance of all your qsort are abysmal... Which is the reason the old sort was replaced by the actual mergesort in the library. Try my code (with 1 elements for example), you'll see that sort is the best on a sorted list, and that qsort is at least 60 times slower (on 1, in fact it is degenerating in O(n^2)). In the real world, the library maintainers decided it was ok to pay a slight overhead in the case where the list to sort is really randomly distributed since mergesort won so hugely over qsort in the pretty frequent case (in programs) of lists which present regularities. There is no sort which is ideal in all situations, but we can try to get a sort that works well in all situations, and don't trash in situations not so infrequent. (On the other hand, don't expect libraries functions to always be the best to use in your particular situation, they tend to be all-purpose as we just saw and the maintainers prefer simple generic implementations rather than complicated ones who could be slightly (or even significantly) faster in some case) -- Jedaï ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] (flawed?) benchmark : sort
Hi > My intuition is only best and fastest code goes to library, to the point > that people can learn from it. It seems I was mislead. The compilers change over time - meaning that the fastest code may change over time. There is also the chance that the original code was not the best - for example the words function in the standard library performs two additional isSpace tests per word. The original code specifies an interface, its thanks to people trying to beat the performance that things improve. Thanks Neil ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] (flawed?) benchmark : sort
Thanks for improved code. My point was to measure which programming patterns are faster than the others so I can learn which ones I should use. However, the thing that is really bad is the fact, that even oneliner qsort_i is faster than library sort. Which is very different from what I've expected. My intuition is only best and fastest code goes to library, to the point that people can learn from it. It seems I was mislead. > It could probably be improved (with classics solution (better > selection of the pivot...)), but the mergesort is only 3 times slower > in worse case, and much more regular, if someone needs a faster sort > in a specific case, it isn't hard to code. > > -- > Jedaï > ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] (flawed?) benchmark : sort
Hi > -- Zadanie 9 - merge sort > mergeSort :: Ord a => [a] -> [a] > mergeSort []= [] > mergeSort [x] = [x] > mergeSort xs= let(l, r) = splitAt (length xs `quot` 2) xs > in mergeSortP (mergeSort l) (mergeSort r) splitAt is not a particularly good way to split a list, since you recurse over the list twice. Try instead: split (x:xs) = (x:b,a) where (a,b) = split xs split [] = ([], []) Perhaps adding some strictness annotations and turning the where into a case. Also remember that a standard benchmark for sorting is an ordered/reverse ordered list, as that causes quicksort to go to O(n^2) given a bad pivot choice. If the sort in the standard libraries can be improved on, it should be replaced. Thanks Neil ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] (flawed?) benchmark : sort
2008/3/4, Krzysztof Skrzętnicki <[EMAIL PROTECTED]>: > Hi > > I was playing with various versions of sorting algorithms. I know it's very > easy to create flawed benchmark and I don't claim those are good ones. > However, it really seems strange to me, that sort - library function - is > actually the worse measured function. I can hardly belive it, and I'd rather > say I have made a mistake preparing it. > > The overall winner seems to be qsort_iv - which is nothing less but old sort > replaced by mergesort now. > > Any clues? Part of what you may be missing : -- cut here -- module Main where import Control.Parallel.Strategies import Control.Arrow import System.CPUTime import System.IO import System.Environment import System.Random import Data.List( partition, sort ) data Tree a = Node (Tree a) a (Tree a) | Leaf qsort_i [] = [] qsort_i (x:xs) = qsort_i (filter (< x) xs) ++ [x] ++ qsort_i (filter (>= x) xs) qsort_ii [] = [] qsort_ii (x:xs) = let (ls,gt) = partition (< x) xs in qsort_ii ls ++ [x] ++ qsort_ii gt qsort_iii xs = qsort_iii' [] xs qsort_iii' acc [] = acc qsort_iii' acc (x:xs) = let (ls,gt) = partition (< x) xs in let acc' = (x:(qsort_iii' acc gt)) in qsort_iii' acc' ls qsort_v [] = [] qsort_v (x:xs) = let (xlt, xgt ) = foldl (\ (lt,gt) el -> case compare x el of GT -> (el:lt, gt) _ -> (lt, el:gt) ) ([],[]) xs in qsort_v xlt ++ [x] ++ qsort_v xgt -- zmodyfikowany i qsort_vi [] = [] qsort_vi (x:xs) = qsort_vi (filter (\y-> compare x y == GT) xs) ++ [x] ++ qsort_vi (filter (\y-> compare x y /= GT) xs) -- zmodyfikowany iii qsort_vii xs = qsort_vii' [] xs qsort_vii' acc [] = acc qsort_vii' acc (x:xs) = let (ls,gt) = partition (\y-> compare x y == GT) xs in let acc' = (x:(qsort_vii' acc gt)) in qsort_vii' acc' ls -- qsort is stable and does not concatenate. qsort_iv xs = qsort_iv' (compare) xs [] qsort_iv' _ [] r = r qsort_iv' _ [x]r = x:r qsort_iv' cmp (x:xs) r = qpart cmp x xs [] [] r -- qpart partitions and sorts the sublists qpart cmp x [] rlt rge r = -- rlt and rge are in reverse order and must be sorted with an -- anti-stable sorting rqsort_iv' cmp rlt (x:rqsort_iv' cmp rge r) qpart cmp x (y:ys) rlt rge r = case cmp x y of GT -> qpart cmp x ys (y:rlt) rge r _ -> qpart cmp x ys rlt (y:rge) r -- rqsort is as qsort but anti-stable, i.e. reverses equal elements rqsort_iv' _ [] r = r rqsort_iv' _ [x]r = x:r rqsort_iv' cmp (x:xs) r = rqpart cmp x xs [] [] r rqpart cmp x [] rle rgt r = qsort_iv' cmp rle (x:qsort_iv' cmp rgt r) rqpart cmp x (y:ys) rle rgt r = case cmp y x of GT -> rqpart cmp x ys rle (y:rgt) r _ -> rqpart cmp x ys (y:rle) rgt r -- code by Orcus -- Zadanie 9 - merge sort mergeSort :: Ord a => [a] -> [a] mergeSort []= [] mergeSort [x] = [x] mergeSort xs= let(l, r) = splitAt (length xs `quot` 2) xs in mergeSortP (mergeSort l) (mergeSort r) -- funkcja pomocnicza scalajÄ…ca dwie listy uporzÄ…dkowane w jednÄ… mergeSortP :: Ord a => [a] -> [a] -> [a] mergeSortP xs []= xs mergeSortP [] ys= ys mergeSortP (x:xs) (y:ys) | x <= y = x:(mergeSortP xs (y:ys)) | otherwise = y:(mergeSortP (x:xs) ys) -- Zadanie 10 - tree sort treeSort :: Ord a => [a] -> [a] -- pointless po raz drugi treeSort = (treeSortInorder . treeSortToTree) treeSortToTree :: Ord a => [a] -> Tree a treeSortToTree [] = Leaf treeSortToTree (x:xs) = let (xlt, xgt) = foldl (\ (lt,gt) el -> case compare x el of GT -> (el:lt, gt) _ -> (lt, el:gt) ) ([],[]) xs in Node (treeSortToTree xlt) x (treeSortToTree xgt) treeSortInorder :: Ord a => Tree a -> [a] treeSortInorder Leaf= [] treeSortInorder (Node l x r)= (treeSortInorder l) ++ [x] ++ (treeSortInorder r) -- end code by Orcus -- begin benchmark making code makeBenchs benchs xs = do let (funcNames, funcs) = unzip benchs tBegin <- getCPUTime timers <- mapM (\f-> print (f xs) >> getCPUTime) funcs let times = zipWith (-) timers (tBegin:timers) sortedResults = sort $ zip times funcNames minT = fromIntegral $ fst $ head sortedResults scaled = map (((/minT) . fromIntegral) *** id) sortedResults hPutStr stderr $ unlines $ map show scaled onRandom eltCnt = do gen <- getStdGen let xs = take eltCnt (randomRs (1::Int, bigNum) gen) `using` rnf xs `seq` return xs onSorted eltCnt = do gen <- getStdGen let xs = take eltCnt (randomRs (1::Int, bigNum) gen) `using` rnf sxs = sort xs `using` rnf xs `seq` sxs `seq` return sxs bigNum = 100 :: Int -- end of benchmark making code main = makeBenchs [("i",qsort_i), ("ii",qsort_ii), ("iii",qsort
[Haskell-cafe] (flawed?) benchmark : sort
Hi I was playing with various versions of sorting algorithms. I know it's very easy to create flawed benchmark and I don't claim those are good ones. However, it really seems strange to me, that sort - library function - is actually the worse measured function. I can hardly belive it, and I'd rather say I have made a mistake preparing it. The overall winner seems to be qsort_iv - which is nothing less but old sort replaced by mergesort now. Any clues? Regards Christopher Skrzętnicki --- cut here --- [EMAIL PROTECTED] haskell]$ ghc -O2 --make qsort.hs && ./qsort +RTS -sstderr -RTS > /dev/null [1 of 1] Compiling Main ( qsort.hs, qsort.o ) Linking qsort ... ./qsort +RTS -sstderr (1.0,"iv") (1.1896770400256864,"v") (1.3091609772011856,"treeSort") (1.592515715933153,"vii") (1.5953543402198838,"vi") (1.5961286512637272,"iii") (1.8175480563244177,"i") (1.8771604568641642,"ii") (2.453160634439497,"mergeSort") (2.6627090768870216,"sort") 26,094,674,624 bytes allocated in the heap 12,716,656,224 bytes copied during GC (scavenged) 2,021,104,592 bytes copied during GC (not scavenged) 107,225,088 bytes maximum residency (140 sample(s)) 49773 collections in generation 0 ( 21.76s) 140 collections in generation 1 ( 23.61s) 305 Mb total memory in use INIT time0.00s ( 0.00s elapsed) MUT time 20.39s ( 20.74s elapsed) GCtime 45.37s ( 46.22s elapsed) EXIT time0.00s ( 0.00s elapsed) Total time 65.76s ( 66.96s elapsed) %GC time 69.0% (69.0% elapsed) Alloc rate1,279,723,644 bytes per MUT second Productivity 31.0% of total user, 30.5% of total elapsed --- cut here --- {-# OPTIONS_GHC -O2 #-} module Main where import System.CPUTime import System.IO import System.Environment import System.Random import Data.List( partition, sort ) data Tree a = Node (Tree a) a (Tree a) | Leaf qsort_i [] = [] qsort_i (x:xs) = qsort_i (filter (< x) xs) ++ [x] ++ qsort_i (filter (>= x) xs) qsort_ii [] = [] qsort_ii (x:xs) = let (ls,gt) = partition (< x) xs in qsort_ii ls ++ [x] ++ qsort_ii gt qsort_iii xs = qsort_iii' [] xs qsort_iii' acc [] = acc qsort_iii' acc (x:xs) = let (ls,gt) = partition (< x) xs in let acc' = (x:(qsort_iii' acc gt)) in qsort_iii' acc' ls qsort_v [] = [] qsort_v (x:xs) = let (xlt, xgt ) = foldl (\ (lt,gt) el -> case compare x el of GT -> (el:lt, gt) _ -> (lt, el:gt) ) ([],[]) xs in qsort_v xlt ++ [x] ++ qsort_v xgt -- zmodyfikowany i qsort_vi [] = [] qsort_vi (x:xs) = qsort_vi (filter (\y-> compare x y == GT) xs) ++ [x] ++ qsort_vi (filter (\y-> compare x y /= GT) xs) -- zmodyfikowany iii qsort_vii xs = qsort_vii' [] xs qsort_vii' acc [] = acc qsort_vii' acc (x:xs) = let (ls,gt) = partition (\y-> compare x y == GT) xs in let acc' = (x:(qsort_vii' acc gt)) in qsort_vii' acc' ls -- qsort is stable and does not concatenate. qsort_iv xs = qsort_iv' (compare) xs [] qsort_iv' _ [] r = r qsort_iv' _ [x]r = x:r qsort_iv' cmp (x:xs) r = qpart cmp x xs [] [] r -- qpart partitions and sorts the sublists qpart cmp x [] rlt rge r = -- rlt and rge are in reverse order and must be sorted with an -- anti-stable sorting rqsort_iv' cmp rlt (x:rqsort_iv' cmp rge r) qpart cmp x (y:ys) rlt rge r = case cmp x y of GT -> qpart cmp x ys (y:rlt) rge r _ -> qpart cmp x ys rlt (y:rge) r -- rqsort is as qsort but anti-stable, i.e. reverses equal elements rqsort_iv' _ [] r = r rqsort_iv' _ [x]r = x:r rqsort_iv' cmp (x:xs) r = rqpart cmp x xs [] [] r rqpart cmp x [] rle rgt r = qsort_iv' cmp rle (x:qsort_iv' cmp rgt r) rqpart cmp x (y:ys) rle rgt r = case cmp y x of GT -> rqpart cmp x ys rle (y:rgt) r _ -> rqpart cmp x ys (y:rle) rgt r -- code by Orcus -- Zadanie 9 - merge sort mergeSort :: Ord a => [a] -> [a] mergeSort []= [] mergeSort [x] = [x] mergeSort xs= let(l, r) = splitAt (length xs `quot` 2) xs in mergeSortP (mergeSort l) (mergeSort r) -- funkcja pomocnicza scalajÄ…ca dwie listy uporzÄ…dkowane w jednÄ… mergeSortP :: Ord a => [a] -> [a] -> [a] mergeSortP xs []= xs mergeSortP [] ys= ys mergeSortP (x:xs) (y:ys) | x <= y = x:(mergeSortP xs (y:ys)) | otherwise = y:(mergeSortP (x:xs) ys) -- Zadanie 10 - tree sort treeSort :: Ord a => [a] -> [a] -- pointless po raz drugi treeSort = (treeSortInorder . treeSortToTree) treeSortToTree :: Ord a => [a] -> Tree a treeSortToTree [] = Leaf treeSortToTree (x:xs) = let (xlt, xgt) = foldl (\ (lt,gt) el -> case compare x el of GT -> (el:lt, gt) _ -> (lt, el:gt) ) ([],[]) xs in Node (treeSortToTree xlt) x (treeSortToTree xgt) treeSortInorder :: Or