Beginners Digest, Vol 45, Issue 26

beginners-request Wed, 21 Mar 2012 03:08:41 -0700

Send Beginners mailing list submissions to
        beginners@haskell.org

To subscribe or unsubscribe via the World Wide Web, visit
        http://www.haskell.org/mailman/listinfo/beginners
or, via email, send a message with subject or body 'help' to
        beginners-requ...@haskell.org


You can reach the person managing the list at
        beginners-ow...@haskell.org

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Beginners digest..."


Today's Topics:

   1. Re:  Red-black tree performance (Adrien Haxaire)
   2.  Libexiv2, cabal and C++ (edgar klerks)
   3. Re:  Red-black tree performance (Heinrich Apfelmus)
   4. Re:  Red-black tree performance (Lorenzo Bolla)
   5. Re:  Red-black tree performance (Lorenzo Bolla)


----------------------------------------------------------------------

Message: 1
Date: Tue, 20 Mar 2012 22:55:57 +0100
From: Adrien Haxaire <adr...@haxaire.org>
Subject: Re: [Haskell-beginners] Red-black tree performance
To: beginners@haskell.org
Message-ID: <20120320215557.GB1426@arch>
Content-Type: text/plain; charset=us-ascii

On Tue, Mar 20, 2012 at 09:03:30PM +0000, Lorenzo Bolla wrote:
> On Tue, Mar 20, 2012 at 5:30 PM, Heinrich Apfelmus <
> apfel...@quantentunnel.de> wrote:
> > Huh, why foldr' and not Data.List.foldl' ? As far as I understand, the
> > former has to evaluate the whole spine of the list first before it can
> > begin to assemble the tree. The latter can start building trees right away.
> >
> >
> Yes, definitely foldl' is worth a try, too:
>  main = putStrLn $ show $ maxTree $ foldl' (flip insert) Empty toInsert
> 
> L.

I tried with foldl'. I modified the code at several places to match the 
argument pattern, and now I see why flip is so useful :) The conclusion is also 
interesting: the productivity climbs up to 92%, while the calculation time 
raises to 6.3s. I guess that the choice is space or time, as often.

Another interesting point. By using foldr' and removing the exclamation marks 
in the Tree data type, I even get better results: 3.76s. Which means a ratio of 
5.87. I tried adding some bang patterns and inline pragmas but it didn't help. 

I also ran the benchmark with 10 000 000 insertions and the ratio remains 
around 6. Which is quite acceptable, right? I'll be glad to show those results 
tomorrow.

I wouldn't imagine that GHC would optimize so well plain Haskell code. The 
algorithm is beautiful, its implementation too, and it produces fast code. Dear 
it's so great to learn Haskell!

Thank you Lorenzo and Heinrich, I have learnt a lot today. 

Best regards,
Adrien

-- 
Adrien Haxaire 
www.adrienhaxaire.org | @adrienhaxaire



------------------------------

Message: 2
Date: Wed, 21 Mar 2012 09:32:15 +0100
From: edgar klerks <edgar.kle...@gmail.com>
Subject: [Haskell-beginners] Libexiv2, cabal and C++
To: beginners <beginners@haskell.org>
Message-ID:
        <cagauytnj_pghyqcqauajwc_0j2awtkacmrjzbogjf7rnygt...@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

Dear Haskellers,

I wrote a simple FFI binding to libExiv2 (
https://bitbucket.org/eklerks/libexiv2bindings/src), which is a library for
reading exif tags from image files.
I want to use cabal as build system, but I don't have a clue how to use it
in this case. Normally it works like a charm.

But because I have to first create a c-wrapper library around the C++ code,
it becomes very complicated.

IDoes anyone have some good resources I can look into?

With kind regards,

Edgar Klerks
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
<http://www.haskell.org/pipermail/beginners/attachments/20120321/a60d11ea/attachment-0001.htm>

------------------------------

Message: 3
Date: Wed, 21 Mar 2012 10:27:15 +0100
From: Heinrich Apfelmus <apfel...@quantentunnel.de>
Subject: Re: [Haskell-beginners] Red-black tree performance
To: beginners@haskell.org
Message-ID: <jkc6tk$uoc$1...@dough.gmane.org>
Content-Type: text/plain; charset=UTF-8; format=flowed

Adrien Haxaire wrote:
> I tried with foldl'. I modified the code at several places to match
> the argument pattern, and now I see why flip is so useful :) The
> conclusion is also interesting: the productivity climbs up to 92%,
> while the calculation time raises to 6.3s. I guess that the choice is
> space or time, as often.

92% productivity seems right for me. In contrast, 20% garbage collection 
may be a sign that something went wrong.

> Another interesting point. By using foldr' and removing the
> exclamation marks in the Tree data type, I even get better results:
> 3.76s. Which means a ratio of 5.87. I tried adding some bang patterns
> and inline pragmas but it didn't help.

I think that this is likely due to laziness: in the very end, you only 
query the rightmost element. After a while, the program simply won't 
evaluate the balancing on the left side of the tree, as you're not 
asking it to evaluate anything there.

So, you're not necessarily comparing apples and apples here. But on the 
other hand, maybe that's a performance disadvantage of the C++ version. 
In Haskell, performance depends a lot on usage patterns.


Best regards,
Heinrich Apfelmus

--
http://apfelmus.nfshost.com




------------------------------

Message: 4
Date: Wed, 21 Mar 2012 10:02:49 +0000
From: Lorenzo Bolla <lbo...@gmail.com>
Subject: Re: [Haskell-beginners] Red-black tree performance
To: beginners@haskell.org
Message-ID:
        <cadjgtrzpdxadh-wxigkehzjfa7+d3jdweauhvzcd+67f077...@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

On Wed, Mar 21, 2012 at 9:27 AM, Heinrich Apfelmus <
apfel...@quantentunnel.de> wrote:

> Adrien Haxaire wrote:
>
>> I tried with foldl'. I modified the code at several places to match
>> the argument pattern, and now I see why flip is so useful :) The
>> conclusion is also interesting: the productivity climbs up to 92%,
>> while the calculation time raises to 6.3s. I guess that the choice is
>> space or time, as often.
>>
>
> 92% productivity seems right for me. In contrast, 20% garbage collection
> may be a sign that something went wrong.

I think that this is likely due to laziness: in the very end, you only
> query the rightmost element. After a while, the program simply won't
> evaluate the balancing on the left side of the tree, as you're not asking
> it to evaluate anything there.
>
> So, you're not necessarily comparing apples and apples here. But on the
> other hand, maybe that's a performance disadvantage of the C++ version. In
> Haskell, performance depends a lot on usage patterns.
>
>
This is very true.
In fact, after some tweaking, I found that the best solution is using
foldl', lazy type and force some strictness in "insert" using "seq". See
below:

import Data.Foldable (foldl', foldr')

data Color = Red | Black deriving (Show)

data Tree a = Empty | Node Color (Tree a) a (Tree a)
              deriving (Show)

insert :: Ord a => a -> Tree a -> Tree a
insert x t = makeBlack (ins t)
             where
               ins Empty = Node Red Empty x Empty
               --  ins (Node color a y b) | x < y  = ins a `seq` balance
color (ins a) y b
               --                         | x == y = Node color a y b
               --                         | x > y  = ins b `seq` balance
color a y (ins b)
               ins (Node color a y b) | x < y  = balance color (ins a) y b
                                      | x == y = Node color a y b
                                      | x > y  = balance color a y (ins b)

makeBlack :: Tree a -> Tree a
makeBlack (Node _ a y b) = Node Black a y b
makeBlack Empty = Empty

balance :: Color -> Tree a -> a -> Tree a -> Tree a
balance Black (Node Red (Node Red a x b) y c) z d = Node Red (Node Black a
x b) y (Node Black c z d)
balance Black (Node Red a x (Node Red b y c)) z d = Node Red (Node Black a
x b) y (Node Black c z d)
balance Black a x (Node Red (Node Red b y c) z d) = Node Red (Node Black a
x b) y (Node Black c z d)
balance Black a x (Node Red b y (Node Red c z d)) = Node Red (Node Black a
x b) y (Node Black c z d)
balance color a x b = Node color a x b

maxTree :: Ord a => Tree a -> a
maxTree (Node _ Empty n Empty) = n
maxTree (Node _ _ _ t) = maxTree t

toInsert :: [Int]
--  toInsert = [1..1000000]
toInsert = map (`mod` 100) [1..10000000]

main :: IO ()
main = putStrLn $ show $ maxTree $ foldl' (flip insert) Empty toInsert


Note that if the improvement is around 10% for "toInsert" being a monotonic
sequence of integers, the improvement is much bigger (>2x for me) for a
more "random" "toInsert" sequence.

L.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
<http://www.haskell.org/pipermail/beginners/attachments/20120321/4d3be6c9/attachment-0001.htm>

------------------------------

Message: 5
Date: Wed, 21 Mar 2012 10:08:06 +0000
From: Lorenzo Bolla <lbo...@gmail.com>
Subject: Re: [Haskell-beginners] Red-black tree performance
To: beginners@haskell.org
Message-ID:
        <CADjgTRw5wxf5JBy-f+ZALN53QUj=jnsw7rdh2r9+gs7gwjm...@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Ops, wrong copy and paste!
See correct script below:

import Data.Foldable (foldl')

data Color = Red | Black deriving (Show)

data Tree a = Empty | Node Color (Tree a) a (Tree a)
              deriving (Show)

insert :: Ord a => a -> Tree a -> Tree a
insert x t = makeBlack (ins t)
             where
               ins Empty = Node Red Empty x Empty
               ins (Node color a y b) | x < y  = ins a `seq` balance color
(ins a) y b
                                      | x == y = Node color a y b
                                      | x > y  = ins b `seq` balance color
a y (ins b)

makeBlack :: Tree a -> Tree a
makeBlack (Node _ a y b) = Node Black a y b
makeBlack Empty = Empty

balance :: Color -> Tree a -> a -> Tree a -> Tree a
balance Black (Node Red (Node Red a x b) y c) z d = Node Red (Node Black a
x b) y (Node Black c z d)
balance Black (Node Red a x (Node Red b y c)) z d = Node Red (Node Black a
x b) y (Node Black c z d)
balance Black a x (Node Red (Node Red b y c) z d) = Node Red (Node Black a
x b) y (Node Black c z d)
balance Black a x (Node Red b y (Node Red c z d)) = Node Red (Node Black a
x b) y (Node Black c z d)
balance color a x b = Node color a x b

maxTree :: Ord a => Tree a -> a
maxTree (Node _ Empty n Empty) = n
maxTree (Node _ _ _ t) = maxTree t

toInsert :: [Int]
--  toInsert = [1..1000000]
toInsert = map (`mod` 100) [1..10000000]

main :: IO ()
main = putStrLn $ show $ maxTree $ foldl' (flip insert) Empty toInsert


Sorry for the noise,
L.




On Wed, Mar 21, 2012 at 10:02 AM, Lorenzo Bolla <lbo...@gmail.com> wrote:

>
>
> On Wed, Mar 21, 2012 at 9:27 AM, Heinrich Apfelmus <
> apfel...@quantentunnel.de> wrote:
>
>> Adrien Haxaire wrote:
>>
>>> I tried with foldl'. I modified the code at several places to match
>>> the argument pattern, and now I see why flip is so useful :) The
>>> conclusion is also interesting: the productivity climbs up to 92%,
>>> while the calculation time raises to 6.3s. I guess that the choice is
>>> space or time, as often.
>>>
>>
>> 92% productivity seems right for me. In contrast, 20% garbage collection
>> may be a sign that something went wrong.
>
> I think that this is likely due to laziness: in the very end, you only
>> query the rightmost element. After a while, the program simply won't
>> evaluate the balancing on the left side of the tree, as you're not asking
>> it to evaluate anything there.
>>
>>
>> So, you're not necessarily comparing apples and apples here. But on the
>> other hand, maybe that's a performance disadvantage of the C++ version. In
>> Haskell, performance depends a lot on usage patterns.
>>
>>
> This is very true.
> In fact, after some tweaking, I found that the best solution is using
> foldl', lazy type and force some strictness in "insert" using "seq". See
> below:
>
> import Data.Foldable (foldl', foldr')
>
> data Color = Red | Black deriving (Show)
>
> data Tree a = Empty | Node Color (Tree a) a (Tree a)
>               deriving (Show)
>
> insert :: Ord a => a -> Tree a -> Tree a
> insert x t = makeBlack (ins t)
>              where
>                ins Empty = Node Red Empty x Empty
>                --  ins (Node color a y b) | x < y  = ins a `seq` balance
> color (ins a) y b
>                --                         | x == y = Node color a y b
>                --                         | x > y  = ins b `seq` balance
> color a y (ins b)
>                ins (Node color a y b) | x < y  = balance color (ins a) y b
>                                       | x == y = Node color a y b
>                                       | x > y  = balance color a y (ins b)
>
> makeBlack :: Tree a -> Tree a
> makeBlack (Node _ a y b) = Node Black a y b
> makeBlack Empty = Empty
>
> balance :: Color -> Tree a -> a -> Tree a -> Tree a
> balance Black (Node Red (Node Red a x b) y c) z d = Node Red (Node Black a
> x b) y (Node Black c z d)
> balance Black (Node Red a x (Node Red b y c)) z d = Node Red (Node Black a
> x b) y (Node Black c z d)
> balance Black a x (Node Red (Node Red b y c) z d) = Node Red (Node Black a
> x b) y (Node Black c z d)
> balance Black a x (Node Red b y (Node Red c z d)) = Node Red (Node Black a
> x b) y (Node Black c z d)
> balance color a x b = Node color a x b
>
> maxTree :: Ord a => Tree a -> a
> maxTree (Node _ Empty n Empty) = n
> maxTree (Node _ _ _ t) = maxTree t
>
> toInsert :: [Int]
> --  toInsert = [1..1000000]
> toInsert = map (`mod` 100) [1..10000000]
>
> main :: IO ()
> main = putStrLn $ show $ maxTree $ foldl' (flip insert) Empty toInsert
>
>
> Note that if the improvement is around 10% for "toInsert" being a
> monotonic sequence of integers, the improvement is much bigger (>2x for me)
> for a more "random" "toInsert" sequence.
>
> L.
>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
<http://www.haskell.org/pipermail/beginners/attachments/20120321/e3259421/attachment.htm>

------------------------------

_______________________________________________
Beginners mailing list
Beginners@haskell.org
http://www.haskell.org/mailman/listinfo/beginners


End of Beginners Digest, Vol 45, Issue 26
*****************************************

Beginners Digest, Vol 45, Issue 26

Reply via email to