Re: [Haskell-cafe] (flawed?) benchmark : sort

2008-03-04 Thread Krzysztof Skrzętnicki
I get it now, thanks. Also, I guess it is possible to find a better
algorithm for standard library sort.

Christopher Skrzętnicki

On Wed, Mar 5, 2008 at 12:04 AM, Chaddaï Fouché <[EMAIL PROTECTED]>
wrote:

> 2008/3/4, Krzysztof Skrzętnicki <[EMAIL PROTECTED]>:
> > Thanks for improved code. My point was to measure which programming
> patterns
> > are faster than the others so I can learn which ones I should use.
> However,
> > the thing that is really bad is the fact, that even oneliner qsort_i is
> > faster than library sort. Which is very different from what I've
> expected.
> > My intuition is only best and fastest code goes to library, to the point
> > that people can learn from it. It seems I was mislead.
>
> I think you did not correctly got the point of my and Neil Mitchell's
> message : you benchmarked those function on a completely random
> sequences so qsort was at his best, but in the real world, most
> sequences would have bias, and it is not infrequent at all to sort a
> partially sorted (or reverse sorted) list... In this case the
> performance of all your qsort are abysmal... Which is the reason the
> old sort was replaced by the actual mergesort in the library. Try my
> code (with 1 elements for example), you'll see that sort is the
> best on a sorted list, and that qsort is at least 60 times slower (on
> 1, in fact it is degenerating in O(n^2)).
>
> In the real world, the library maintainers decided it was ok to pay a
> slight overhead in the case where the list to sort is really randomly
> distributed since mergesort won so hugely over qsort in the pretty
> frequent case (in programs) of lists which present regularities.
>
> There is no sort which is ideal in all situations, but we can try to
> get a sort that works well in all situations, and don't trash in
> situations not so infrequent.
>
> (On the other hand, don't expect libraries functions to always be the
> best to use in your particular situation, they tend to be all-purpose
> as we just saw and the maintainers prefer simple generic
> implementations rather than complicated ones who could be slightly (or
> even significantly) faster in some case)
>
> --
> Jedaï
>
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] (flawed?) benchmark : sort

2008-03-04 Thread Chaddaï Fouché
2008/3/4, Krzysztof Skrzętnicki <[EMAIL PROTECTED]>:
> Thanks for improved code. My point was to measure which programming patterns
> are faster than the others so I can learn which ones I should use. However,
> the thing that is really bad is the fact, that even oneliner qsort_i is
> faster than library sort. Which is very different from what I've expected.
> My intuition is only best and fastest code goes to library, to the point
> that people can learn from it. It seems I was mislead.

I think you did not correctly got the point of my and Neil Mitchell's
message : you benchmarked those function on a completely random
sequences so qsort was at his best, but in the real world, most
sequences would have bias, and it is not infrequent at all to sort a
partially sorted (or reverse sorted) list... In this case the
performance of all your qsort are abysmal... Which is the reason the
old sort was replaced by the actual mergesort in the library. Try my
code (with 1 elements for example), you'll see that sort is the
best on a sorted list, and that qsort is at least 60 times slower (on
1, in fact it is degenerating in O(n^2)).

In the real world, the library maintainers decided it was ok to pay a
slight overhead in the case where the list to sort is really randomly
distributed since mergesort won so hugely over qsort in the pretty
frequent case (in programs) of lists which present regularities.

There is no sort which is ideal in all situations, but we can try to
get a sort that works well in all situations, and don't trash in
situations not so infrequent.

(On the other hand, don't expect libraries functions to always be the
best to use in your particular situation, they tend to be all-purpose
as we just saw and the maintainers prefer simple generic
implementations rather than complicated ones who could be slightly (or
even significantly) faster in some case)

-- 
Jedaï
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] (flawed?) benchmark : sort

2008-03-04 Thread Neil Mitchell
Hi

> My intuition is only best and fastest code goes to library, to the point
> that people can learn from it. It seems I was mislead.

The compilers change over time - meaning that the fastest code may
change over time. There is also the chance that the original code was
not the best - for example the words function in the standard library
performs two additional isSpace tests per word. The original code
specifies an interface, its thanks to people trying to beat the
performance that things improve.

Thanks

Neil
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] (flawed?) benchmark : sort

2008-03-04 Thread Krzysztof Skrzętnicki
Thanks for improved code. My point was to measure which programming patterns
are faster than the others so I can learn which ones I should use. However,
the thing that is really bad is the fact, that even oneliner qsort_i is
faster than library sort. Which is very different from what I've expected.
My intuition is only best and fastest code goes to library, to the point
that people can learn from it. It seems I was mislead.


> It could probably be improved (with classics solution (better
> selection of the pivot...)), but the mergesort is only 3 times slower
> in worse case, and much more regular, if someone needs a faster sort
> in a specific case, it isn't hard to code.
>
> --
> Jedaï
>
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] (flawed?) benchmark : sort

2008-03-04 Thread Neil Mitchell
Hi

>  -- Zadanie 9 - merge sort
>  mergeSort :: Ord a => [a] -> [a]
>  mergeSort []= []
>  mergeSort [x]   = [x]
>  mergeSort xs= let(l, r) = splitAt (length xs `quot` 2) xs
>   in mergeSortP (mergeSort l) (mergeSort r)

splitAt is not a particularly good way to split a list, since you
recurse over the list twice. Try instead:

split (x:xs) = (x:b,a)
where (a,b) = split xs
split [] = ([], [])

Perhaps adding some strictness annotations and turning the where into a case.

Also remember that a standard benchmark for sorting is an
ordered/reverse ordered list, as that causes quicksort to go to O(n^2)
given a bad pivot choice.

If the sort in the standard libraries can be improved on, it should be replaced.

Thanks

Neil
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] (flawed?) benchmark : sort

2008-03-04 Thread Chaddaï Fouché
2008/3/4, Krzysztof Skrzętnicki <[EMAIL PROTECTED]>:
> Hi
>
> I was playing with various versions of sorting algorithms. I know it's very
> easy to create flawed benchmark and I don't claim those are good ones.
> However, it really seems strange to me, that sort - library function - is
> actually the worse measured function. I can hardly belive it, and I'd rather
> say I have made a mistake preparing it.
>
> The overall winner seems to be qsort_iv - which is nothing less but old sort
> replaced by mergesort now.
>
> Any clues?

Part of what you may be missing :
-- cut here --
module Main where

import Control.Parallel.Strategies
import Control.Arrow
import System.CPUTime
import System.IO
import System.Environment
import System.Random
import Data.List( partition, sort )

data Tree a =
Node (Tree a) a (Tree a)
| Leaf


qsort_i []  = []
qsort_i (x:xs) = qsort_i (filter (< x) xs) ++ [x] ++ qsort_i (filter (>= x) xs)

qsort_ii [] = []
qsort_ii (x:xs) = let (ls,gt) = partition (< x) xs in qsort_ii ls ++
[x] ++ qsort_ii gt

qsort_iii xs = qsort_iii' [] xs
qsort_iii' acc [] = acc
qsort_iii' acc (x:xs) =
let (ls,gt) = partition (< x) xs in
let acc' = (x:(qsort_iii' acc gt)) in qsort_iii' acc' ls

qsort_v [] = []
qsort_v (x:xs) = let (xlt, xgt ) = foldl (\ (lt,gt) el -> case compare x el of
GT -> (el:lt, gt)
_  -> (lt,
el:gt) ) ([],[]) xs
 in qsort_v xlt ++ [x] ++ qsort_v xgt

-- zmodyfikowany i
qsort_vi [] = []
qsort_vi (x:xs) = qsort_vi (filter (\y-> compare x y == GT) xs) ++ [x]
++ qsort_vi (filter (\y-> compare x y /= GT) xs)


-- zmodyfikowany iii
qsort_vii xs = qsort_vii' [] xs
qsort_vii' acc [] = acc
qsort_vii' acc (x:xs) =
let (ls,gt) = partition (\y-> compare x y == GT) xs in
let acc' = (x:(qsort_vii' acc gt)) in qsort_vii' acc' ls



-- qsort is stable and does not concatenate.
qsort_iv xs = qsort_iv' (compare) xs []

qsort_iv' _   [] r = r
qsort_iv' _   [x]r = x:r
qsort_iv' cmp (x:xs) r = qpart cmp x xs [] [] r

-- qpart partitions and sorts the sublists
qpart cmp x [] rlt rge r =
-- rlt and rge are in reverse order and must be sorted with an
-- anti-stable sorting
rqsort_iv' cmp rlt (x:rqsort_iv' cmp rge r)
qpart cmp x (y:ys) rlt rge r =
case cmp x y of
GT -> qpart cmp x ys (y:rlt) rge r
_  -> qpart cmp x ys rlt (y:rge) r

-- rqsort is as qsort but anti-stable, i.e. reverses equal elements
rqsort_iv' _   [] r = r
rqsort_iv' _   [x]r = x:r
rqsort_iv' cmp (x:xs) r = rqpart cmp x xs [] [] r

rqpart cmp x [] rle rgt r =
qsort_iv' cmp rle (x:qsort_iv' cmp rgt r)
rqpart cmp x (y:ys) rle rgt r =
case cmp y x of
GT -> rqpart cmp x ys rle (y:rgt) r
_  -> rqpart cmp x ys (y:rle) rgt r


-- code by Orcus

-- Zadanie 9 - merge sort
mergeSort :: Ord a => [a] -> [a]
mergeSort []= []
mergeSort [x]   = [x]
mergeSort xs= let(l, r) = splitAt (length xs `quot` 2) xs
  in mergeSortP (mergeSort l) (mergeSort r)

-- funkcja pomocnicza scalajÄ…ca dwie listy uporzÄ…dkowane w jednÄ…
mergeSortP :: Ord a => [a] -> [a] -> [a]
mergeSortP xs []= xs
mergeSortP [] ys= ys
mergeSortP (x:xs) (y:ys)
| x <= y = x:(mergeSortP xs (y:ys))
| otherwise = y:(mergeSortP (x:xs) ys)

-- Zadanie 10 - tree sort
treeSort :: Ord a => [a] -> [a]
-- pointless po raz drugi
treeSort = (treeSortInorder . treeSortToTree)

treeSortToTree :: Ord a => [a] -> Tree a
treeSortToTree []   = Leaf
treeSortToTree (x:xs)   = let (xlt, xgt) = foldl (\ (lt,gt) el -> case
compare x el of
GT -> (el:lt, gt)
_  -> (lt,
el:gt) ) ([],[]) xs
  in Node (treeSortToTree xlt) x (treeSortToTree xgt)

treeSortInorder :: Ord a => Tree a -> [a]
treeSortInorder Leaf= []
treeSortInorder (Node l x r)= (treeSortInorder l) ++ [x] ++
(treeSortInorder r)

-- end code by Orcus

-- begin benchmark making code

makeBenchs benchs xs = do
  let (funcNames, funcs) = unzip benchs
  tBegin <- getCPUTime
  timers <- mapM (\f-> print (f xs) >> getCPUTime) funcs
  let times = zipWith (-) timers (tBegin:timers)
  sortedResults = sort $ zip times funcNames
  minT = fromIntegral $ fst $ head sortedResults
  scaled = map (((/minT) . fromIntegral) *** id) sortedResults
  hPutStr stderr $ unlines $ map show scaled

onRandom eltCnt = do
  gen <- getStdGen
  let xs = take eltCnt (randomRs (1::Int, bigNum) gen) `using` rnf
  xs `seq` return xs

onSorted eltCnt = do
  gen <- getStdGen
  let xs = take eltCnt (randomRs (1::Int, bigNum) gen) `using` rnf
  sxs = sort xs `using` rnf
  xs `seq` sxs `seq` return sxs

bigNum = 100 :: Int

-- end of benchmark making code

main = makeBenchs [("i",qsort_i),
   ("ii",qsort_ii),
   ("iii",qsort

[Haskell-cafe] (flawed?) benchmark : sort

2008-03-03 Thread Krzysztof Skrzętnicki
Hi

I was playing with various versions of sorting algorithms. I know it's very
easy to create flawed benchmark and I don't claim those are good ones.
However, it really seems strange to me, that sort - library function - is
actually the worse measured function. I can hardly belive it, and I'd rather
say I have made a mistake preparing it.

The overall winner seems to be qsort_iv - which is nothing less but old sort
replaced by mergesort now.

Any clues?

Regards
Christopher Skrzętnicki

--- cut here ---
[EMAIL PROTECTED] haskell]$ ghc -O2 --make qsort.hs && ./qsort +RTS -sstderr
-RTS > /dev/null
[1 of 1] Compiling Main ( qsort.hs, qsort.o )
Linking qsort ...
./qsort +RTS -sstderr
(1.0,"iv")
(1.1896770400256864,"v")
(1.3091609772011856,"treeSort")
(1.592515715933153,"vii")
(1.5953543402198838,"vi")
(1.5961286512637272,"iii")
(1.8175480563244177,"i")
(1.8771604568641642,"ii")
(2.453160634439497,"mergeSort")
(2.6627090768870216,"sort")
26,094,674,624 bytes allocated in the heap
12,716,656,224 bytes copied during GC (scavenged)
2,021,104,592 bytes copied during GC (not scavenged)
107,225,088 bytes maximum residency (140 sample(s))

  49773 collections in generation 0 ( 21.76s)
140 collections in generation 1 ( 23.61s)

305 Mb total memory in use

  INIT  time0.00s  (  0.00s elapsed)
  MUT   time   20.39s  ( 20.74s elapsed)
  GCtime   45.37s  ( 46.22s elapsed)
  EXIT  time0.00s  (  0.00s elapsed)
  Total time   65.76s  ( 66.96s elapsed)

  %GC time  69.0%  (69.0% elapsed)

  Alloc rate1,279,723,644 bytes per MUT second

  Productivity  31.0% of total user, 30.5% of total elapsed


--- cut here ---

{-# OPTIONS_GHC -O2 #-}
module Main where

import System.CPUTime
import System.IO
import System.Environment
import System.Random
import Data.List( partition, sort )

data Tree a =
Node (Tree a) a (Tree a)
| Leaf


qsort_i []  = []
qsort_i (x:xs) = qsort_i (filter (< x) xs) ++ [x] ++ qsort_i (filter (>= x)
xs)

qsort_ii [] = []
qsort_ii (x:xs) = let (ls,gt) = partition (< x) xs in qsort_ii ls ++ [x] ++
qsort_ii gt

qsort_iii xs = qsort_iii' [] xs
qsort_iii' acc [] = acc
qsort_iii' acc (x:xs) =
let (ls,gt) = partition (< x) xs in
let acc' = (x:(qsort_iii' acc gt)) in qsort_iii' acc' ls

qsort_v [] = []
qsort_v (x:xs) = let (xlt, xgt ) = foldl (\ (lt,gt) el -> case compare x el
of
GT -> (el:lt,
gt)
_  -> (lt,
el:gt) ) ([],[]) xs
 in qsort_v xlt ++ [x] ++ qsort_v xgt

-- zmodyfikowany i
qsort_vi [] = []
qsort_vi (x:xs) = qsort_vi (filter (\y-> compare x y == GT) xs) ++ [x] ++
qsort_vi (filter (\y-> compare x y /= GT) xs)


-- zmodyfikowany iii
qsort_vii xs = qsort_vii' [] xs
qsort_vii' acc [] = acc
qsort_vii' acc (x:xs) =
let (ls,gt) = partition (\y-> compare x y == GT) xs in
let acc' = (x:(qsort_vii' acc gt)) in qsort_vii' acc' ls



-- qsort is stable and does not concatenate.
qsort_iv xs = qsort_iv' (compare) xs []

qsort_iv' _   [] r = r
qsort_iv' _   [x]r = x:r
qsort_iv' cmp (x:xs) r = qpart cmp x xs [] [] r

-- qpart partitions and sorts the sublists
qpart cmp x [] rlt rge r =
-- rlt and rge are in reverse order and must be sorted with an
-- anti-stable sorting
rqsort_iv' cmp rlt (x:rqsort_iv' cmp rge r)
qpart cmp x (y:ys) rlt rge r =
case cmp x y of
GT -> qpart cmp x ys (y:rlt) rge r
_  -> qpart cmp x ys rlt (y:rge) r

-- rqsort is as qsort but anti-stable, i.e. reverses equal elements
rqsort_iv' _   [] r = r
rqsort_iv' _   [x]r = x:r
rqsort_iv' cmp (x:xs) r = rqpart cmp x xs [] [] r

rqpart cmp x [] rle rgt r =
qsort_iv' cmp rle (x:qsort_iv' cmp rgt r)
rqpart cmp x (y:ys) rle rgt r =
case cmp y x of
GT -> rqpart cmp x ys rle (y:rgt) r
_  -> rqpart cmp x ys (y:rle) rgt r


-- code by Orcus

-- Zadanie 9 - merge sort
mergeSort :: Ord a => [a] -> [a]
mergeSort []= []
mergeSort [x]   = [x]
mergeSort xs= let(l, r) = splitAt (length xs `quot` 2) xs
  in mergeSortP (mergeSort l) (mergeSort r)

-- funkcja pomocnicza scalajÄ…ca dwie listy uporzÄ…dkowane w jednÄ…
mergeSortP :: Ord a => [a] -> [a] -> [a]
mergeSortP xs []= xs
mergeSortP [] ys= ys
mergeSortP (x:xs) (y:ys)
| x <= y = x:(mergeSortP xs (y:ys))
| otherwise = y:(mergeSortP (x:xs) ys)

-- Zadanie 10 - tree sort
treeSort :: Ord a => [a] -> [a]
-- pointless po raz drugi
treeSort = (treeSortInorder . treeSortToTree)

treeSortToTree :: Ord a => [a] -> Tree a
treeSortToTree []   = Leaf
treeSortToTree (x:xs)   = let (xlt, xgt) = foldl (\ (lt,gt) el -> case
compare x el of
GT -> (el:lt,
gt)
_  -> (lt,
el:gt) ) ([],[]) xs
  in Node (treeSortToTree xlt) x (treeSortToTree
xgt)

treeSortInorder :: Or