Beginners Digest, Vol 48, Issue 24

beginners-request Tue, 19 Jun 2012 15:02:06 -0700

Send Beginners mailing list submissions to
        beginners@haskell.org

To subscribe or unsubscribe via the World Wide Web, visit
        http://www.haskell.org/mailman/listinfo/beginners
or, via email, send a message with subject or body 'help' to
        beginners-requ...@haskell.org

You can reach the person managing the list at
        beginners-ow...@haskell.org

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Beginners digest..."

Today's Topics:

   1.  Improving Performance (Robert Heum?ller)
   2. Re:  General Hints on Improving Performance - Example DBSCAN
      (Robert Heum?ller)
   3. Re:  Improving Performance (Ertugrul S?ylemez)
   4. Re:  Improving Performance (Robert Heum?ller)
   5.  Understanding how laziness works (Matt Ford)
   6. Re:  Improving Performance (Robert Heum?ller)
   7. Re:  Improving Performance (Ertugrul S?ylemez)
   8. Re:  Understanding how laziness works (Alexander Batischev)
   9.  Function application versus function     composition performance
      (Matt Ford)

----------------------------------------------------------------------

Message: 1
Date: Tue, 19 Jun 2012 16:37:57 +0200
From: Robert Heum?ller <mail...@heum.de>
Subject: [Haskell-beginners] Improving Performance
To: beginners@haskell.org
Message-ID: <20120619163757.1c528917@thor>
Content-Type: text/plain; charset=US-ASCII

Hello,

in order to tune my brain to "functional thinking" I've decided to
start writing some "real" programs. After having spent a couple of days
figuring out how to translate an imperative algorithm into stateless
haskell I beleive I've now managed a simple implementation of the DBSCAN
clustering algorithm.

http://privatepaste.com/e6bb4fb665

This code has several drawbacks (and probably it doesn't work correctly
after all) but I would like to tune its performance.
Obviously the code calls several functions multiply with the same
arguments. Haskell, being stateless should thus always yield the same
resuts, correct? In this case performing the same calculations multiply
seems pointless.

My question: 
How would you go about improving this code? 
In any imperative language I would simply cache distances/neighborhoods
in a matrix. Obviously this is not what the way to go in haskell?

Thank you very much

------------------------------

Message: 2
Date: Tue, 19 Jun 2012 16:45:51 +0200
From: Robert Heum?ller <mail...@heum.de>
Subject: Re: [Haskell-beginners] General Hints on Improving
        Performance - Example DBSCAN
To: beginners@haskell.org
Message-ID: <20120619164551.5a2a337e@thor>
Content-Type: text/plain; charset=US-ASCII

I'm sorry the Thread title is not very helpful... I hope this is better

> Hello,
> 
> in order to tune my brain to "functional thinking" I've decided to
> start writing some "real" programs. After having spent a couple of
> days figuring out how to translate an imperative algorithm into
> stateless haskell I beleive I've now managed a simple implementation
> of the DBSCAN clustering algorithm.
> 
> http://privatepaste.com/e6bb4fb665
> 
> This code has several drawbacks (and probably it doesn't work
> correctly after all) but I would like to tune its performance.
> Obviously the code calls several functions multiply with the same
> arguments. Haskell, being stateless should thus always yield the same
> resuts, correct? In this case performing the same calculations
> multiply seems pointless.
> 
> My question: 
> How would you go about improving this code? 
> In any imperative language I would simply cache
> distances/neighborhoods in a matrix. Obviously this is not what the
> way to go in haskell?
> 
> Thank you very much
> 
> _______________________________________________
> Beginners mailing list
> Beginners@haskell.org
> http://www.haskell.org/mailman/listinfo/beginners

------------------------------

Message: 3
Date: Tue, 19 Jun 2012 17:01:26 +0200
From: Ertugrul S?ylemez <e...@ertes.de>
Subject: Re: [Haskell-beginners] Improving Performance
To: beginners@haskell.org
Message-ID: <20120619170126.5c4b5...@tritium.streitmacht.eu>
Content-Type: text/plain; charset="utf-8"

Robert Heum?ller <mail...@heum.de> wrote:

> in order to tune my brain to "functional thinking" I've decided to
> start writing some "real" programs. After having spent a couple of
> days figuring out how to translate an imperative algorithm into
> stateless haskell I beleive I've now managed a simple implementation
> of the DBSCAN clustering algorithm.
>
> http://privatepaste.com/e6bb4fb665

When asking people to review your code, it would be very helpful to
write type signatures for your top-level functions.  This alone makes
code greatly more readable and also tells a lot about the quality right
away.  I advice you to get used to writing type signatures for yourself
as well.

Greets,
Ertugrul

-- 
Not to be or to be and (not to be or to be and (not to be or to be and
(not to be or to be and ... that is the list monad.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: not available
URL: 
<http://www.haskell.org/pipermail/beginners/attachments/20120619/dae1a369/attachment-0001.pgp>

------------------------------

Message: 4
Date: Tue, 19 Jun 2012 17:42:48 +0200
From: Robert Heum?ller <mail...@heum.de>
Subject: Re: [Haskell-beginners] Improving Performance
To: beginners@haskell.org
Message-ID: <20120619174248.2cae37ee@thor>
Content-Type: text/plain; charset=ISO-8859-1

That's a great start, thanks ;)

Am Tue, 19 Jun 2012 17:01:26 +0200
schrieb Ertugrul S?ylemez <e...@ertes.de>:

> Robert Heum?ller <mail...@heum.de> wrote:
> 
> > in order to tune my brain to "functional thinking" I've decided to
> > start writing some "real" programs. After having spent a couple of
> > days figuring out how to translate an imperative algorithm into
> > stateless haskell I beleive I've now managed a simple implementation
> > of the DBSCAN clustering algorithm.
> >
> > http://privatepaste.com/e6bb4fb665
> 
> When asking people to review your code, it would be very helpful to
> write type signatures for your top-level functions.  This alone makes
> code greatly more readable and also tells a lot about the quality
> right away.  I advice you to get used to writing type signatures for
> yourself as well.
> 
> 
> Greets,
> Ertugrul
> 

------------------------------

Message: 5
Date: Tue, 19 Jun 2012 17:20:15 +0100
From: Matt Ford <m...@dancingfrog.co.uk>
Subject: [Haskell-beginners] Understanding how laziness works
To: beginners@haskell.org
Message-ID:
        <ca+fwtn9sd9fqhx4q5wsqbbkq9fgwt9ymx3fomsydvv2qjxh...@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

Apologies for the maybe double post (no subject and hadn't confirmed
the mailing list first time I sent this).

Hi,

Here's a simple expression

  double ( double [1,2,3,4] ) !! 0

The 1st element of a list is

double ( 1*2 : double [2,3,4] ) !! 0
1*2*2 : double ( double [2,3,4 ] ) !! 0
1*2*2
4

calculated without having to traverse the rest of the list due to laziness.

But how far does the laziness extend?  For example, if I take the 2nd element

 double (double [1,2,3,4]) !! 1

Do we have

1*2*2 : double ( 2*2 : double [3,4] ) !! 1
1*2*2 : 2*2*2 : double ( double [3,4] ) !! 1
2*2*2
8

Resulting in 8.

I assume that the 1st list entry 1*2*2 doesn't actually get
calculated, but the expression does get formed?  The second element
(index 1) gets formed, isolated (by !!)  and calculated as the return
value.  And then
things can happily stop.

What I'm checking is that the laziness doesn't some how magically
extend to not having to form the preceding expressions that make up
the resulting list?  Or can Haskell recognize that this is like doing
function composition i.e.,

(double . double) [ 1 2, 3, 4 ] !! 1

Thanks for any answer.

Cheers,

Matt.

p.s, I kinda hope things are how I've assumed as the example finally
shows me some concrete benefit of function composition which  has
never seemed important given it's so easily re-written as application.

------------------------------

Message: 6
Date: Tue, 19 Jun 2012 21:44:10 +0200
From: Robert Heum?ller <mail...@heum.de>
Subject: Re: [Haskell-beginners] Improving Performance
To: beginners@haskell.org
Message-ID: <20120619214410.35dc28a2@thor>
Content-Type: text/plain; charset="iso-8859-1"

Hi again,

I've added the type signatures. What's next?

http://privatepaste.com/26073e4b0e

Thanks

Am Tue, 19 Jun 2012 17:01:26 +0200
schrieb Ertugrul S?ylemez <e...@ertes.de>:

> Robert Heum?ller <mail...@heum.de> wrote:
> 
> > in order to tune my brain to "functional thinking" I've decided to
> > start writing some "real" programs. After having spent a couple of
> > days figuring out how to translate an imperative algorithm into
> > stateless haskell I beleive I've now managed a simple implementation
> > of the DBSCAN clustering algorithm.
> >
> > http://privatepaste.com/e6bb4fb665
> 
> When asking people to review your code, it would be very helpful to
> write type signatures for your top-level functions.  This alone makes
> code greatly more readable and also tells a lot about the quality
> right away.  I advice you to get used to writing type signatures for
> yourself as well.
> 
> 
> Greets,
> Ertugrul
> 

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 490 bytes
Desc: not available
URL: 
<http://www.haskell.org/pipermail/beginners/attachments/20120619/a26caa00/attachment-0001.pgp>

------------------------------

Message: 7
Date: Tue, 19 Jun 2012 22:25:05 +0200
From: Ertugrul S?ylemez <e...@ertes.de>
Subject: Re: [Haskell-beginners] Improving Performance
To: beginners@haskell.org
Message-ID: <20120619222505.656c5...@tritium.streitmacht.eu>
Content-Type: text/plain; charset="utf-8"

Robert Heum?ller <mail...@heum.de> wrote:

> I've added the type signatures. What's next?
>
> http://privatepaste.com/26073e4b0e

Great, now that makes it much easier to reason about performance.  A
quick look at the type signatures tells me that your next step is to use
the right tool for the job.  In Haskell this basically means to use the
right data and control structures.

First of all know when not to use lists.  A good intuition is to ask
yourself whether the lists will actually live in memory.  If no, then
lists are fine.  If yes (like in your case), examine your usage.  If you
use them like stacks (working on the head only) that's probably fine
(it's not in your case).  However, if you fold and unfold a lot or index
individual elements you probably want a different data structure like a
vector or a map.  This is a somewhat involved topic and needs a lot of
training, but making the right decision here will give you the main
performance boost.

Note:  Whatever you do, stay in pure land.  Always use immutable data
structures.  If your code is too slow, you may be using the right data
structure the wrong way.  Going mutable is almost always the wrong way.
Again: It takes some training, especially in Haskell where (as a
beginner) you may not see directly what is actually happening.

Also find spots where sharing can be used.  Unshared example:

    f x' + f x'

Shared version:

    let x = f x' in x + x

Haskell compilers don't do that automatically for you, because sharing
trades memory for run-time performance, and the outcome is hard to
predict for a compiler.  I suggest experimenting.

Greets,
Ertugrul

-- 
Not to be or to be and (not to be or to be and (not to be or to be and
(not to be or to be and ... that is the list monad.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: not available
URL: 
<http://www.haskell.org/pipermail/beginners/attachments/20120619/cff0a484/attachment-0001.pgp>

------------------------------

Message: 8
Date: Wed, 20 Jun 2012 00:55:04 +0400
From: Alexander Batischev <eual...@gmail.com>
Subject: Re: [Haskell-beginners] Understanding how laziness works
To: beginners@haskell.org
Message-ID: <20120619205504.GB8505@antaeus>
Content-Type: text/plain; charset="us-ascii"

Hello!

On Tue, Jun 19, 2012 at 05:20:15PM +0100, Matt Ford wrote:
> I assume that the 1st list entry 1*2*2 doesn't actually get
> calculated, but the expression does get formed?  The second element
> (index 1) gets formed, isolated (by !!)  and calculated as the return
> value.  And then
> things can happily stop.

You're right. The spine of the list is being formed, but it doesn't
necessarily evaluate values in the cells of the list.

Basically, you can imagine some linked list where each cell have two
pointers: one pointing to the next element (or thunk) and another
pointing to the value (or thunk) of that cell. When you ask for some
element of the list, linked list is being built up to the element you
asked for, but only that thunks that are "next elements" are being
evaluated - values that aren't used stay being thunks.

Now, in your particular example you have constant index and constant
list, so I'm pretty sure any Haskell compiler would simply execute the
code and put the result into program. But that's only an expectation,
not knowledge, so please be critical about it.

-- 
Regards,
Alexander Batischev

PGP key 356961A20C8BFD03
Fingerprint: CE6C 4307 9348 58E3 FD94  A00F 3569 61A2 0C8B FD03

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 836 bytes
Desc: Digital signature
URL: 
<http://www.haskell.org/pipermail/beginners/attachments/20120620/2fa85209/attachment-0001.pgp>

------------------------------

Message: 9
Date: Tue, 19 Jun 2012 23:01:32 +0100
From: Matt Ford <m...@dancingfrog.co.uk>
Subject: [Haskell-beginners] Function application versus function
        composition performance
To: beginners@haskell.org
Message-ID:
        <CA+FwTn_cXy=7zsjsas2rx6nzzbgwur7kpokwmklqmmmvwx7...@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

Hi All,

My last question got me thinking (always dangerous): an expression
that doubles a list and takes the 2nd element (index 1), I assume,
goes through the following steps.

double (double [1,2,3,4]) !! 1
double ( 1*2 : double [2,3,4]) !! 1
1*2*2 : double ( double [2,3,4] ) !! 1
1*2*2 : double ( 2*2 : double [3,4] ) !! 1
1*2*2 : 2*2*2 : double ( double [3,4] ) !! 1
2*2*2
8

Up until the element requested all the proceeding elements have to
have the expression formed as Haskell process the calculation.

Now, I want to compare this to using function composition

( double . double ) [ 1 ,2, 3, 4 ] !! 1

This is the bit I'm unsure of - what does the composition look like.
It is easy to see that it could be simplified to something like:

( double . double) (x:xs) = x*4 : (double . double) xs

This would mean the steps could be written

(double . double) [ 1,2,3,4 ] !! 1
(1*4 : (double.double) [2,3,4]) !! 1
(1*4 : 2*4 : (double.double) [ 3,4 ]) !! 1
2*4
8

Ignoring the start and end steps, it will be half the number of steps
compared to the application version.  Significant then, over long
lists.

So is this true, are composite functions simplified in Haskell in
general so that they may have improved performance over function
application?

I've seen posts on the internet that say it's a matter of style only:
http://stackoverflow.com/questions/3030675/haskell-function-composition-and-function-application-idioms-correct-us.
 But my reasoning suggests it could be more than that.

Perhaps, function application is similarly optimised - maybe by
replacing all functions applications with composition and then
simplifying?  Or maybe the simplifying/optimisation step never
happens?  As you can see I'm just guessing at things :-)  But it's
nice to wonder.

Many thanks for any pointers,

Matt.

------------------------------

_______________________________________________
Beginners mailing list
Beginners@haskell.org
http://www.haskell.org/mailman/listinfo/beginners

End of Beginners Digest, Vol 48, Issue 24
*****************************************

Beginners Digest, Vol 48, Issue 24

Reply via email to