Beginners Digest, Vol 40, Issue 11

beginners-request Sun, 09 Oct 2011 14:40:53 -0700

Today's Topics:

   1.  How would you improve this program? (Lorenzo Bolla)
   2. Re:  How would you improve this program? (Michael Xavier)
   3. Re:  How would you improve this program? (Chaddaï Fouché)

----------------------------------------------------------------------

Message: 1
Date: Sun, 9 Oct 2011 21:11:35 +0100
From: Lorenzo Bolla <lbo...@gmail.com>
Subject: [Haskell-beginners] How would you improve this program?
To: beginners@haskell.org
Message-ID:
        <cadjgtrw02sfarmbwnszbqn4uj2f6bclyctm0tg+4mnc1nak...@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Hi all,
I'm new to Haskell and I'd like you to take a look at one of my programs and
tell me how you would improve it (in terms of efficiency, style, and so
on!).

The source code is here:
https://github.com/lbolla/stanford-cs240h/blob/master/lab1/lab1.hs
The program is an implementation of this problem:
http://www.scs.stanford.edu/11au-cs240h/labs/lab1.html (basically, counting
how many times a word appears in a text.)
(I'm not a Stanford student, so by helping me out you won't help me to cheat
my exam, don't worry!)

I've implemented 3 versions of the algorithm:

   1. a Haskell version using the standard "sort": read all the words from
   stdin, sort them and group them.
   2. a Haskell version using map: read all the words from stdin, stick each
   word in a Data.Map incrementing a counter if the word is already present in
   the map.
   3. a Python version using defaultdict.
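
The two Haskell approaches in the list above can be sketched roughly as
follows. This is not the code from the linked repository; the names
countBySort and countByMap are made up for illustration, and the input is
assumed to be already split into words.

```haskell
import Data.List (foldl', group, sort)
import qualified Data.Map.Strict as M

-- Version 1: sort the words, then group equal neighbours and count each group.
countBySort :: [String] -> [(String, Int)]
countBySort ws = [(w, length g) | g@(w : _) <- group (sort ws)]

-- Version 2: insert each word into a map, adding 1 to any existing count.
-- Data.Map.Strict forces each count as it is stored.
countByMap :: [String] -> [(String, Int)]
countByMap = M.toList . foldl' step M.empty
  where
    step m w = M.insertWith (+) w 1 m
```

Both functions applied to words "a b a c b a" give
[("a",3),("b",2),("c",1)], since M.toList returns keys in ascending order.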

I timed the different versions and the results are here:
https://github.com/lbolla/stanford-cs240h/blob/master/lab1/times.png.
The Python version is the quickest (I stripped out the fancy formatting
before benchmarking, so IO is not responsible for the time difference).
Any comments on the graph, too?

Thanks a lot!
L.

------------------------------

Message: 2
Date: Sun, 9 Oct 2011 13:52:19 -0700
From: Michael Xavier <nemesisdes...@gmail.com>
Subject: Re: [Haskell-beginners] How would you improve this program?
To: Lorenzo Bolla <lbo...@gmail.com>
Cc: beginners@haskell.org
Message-ID:
        <CANk=zmhueh84vhc45qsjcg4xhoq4eyctr532yopmewueycu...@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

This is a stylistic issue and it is certainly subjective, but I personally
prefer using "where" a lot as opposed to let ... in and lambdas. You use
lambdas inline with maps a lot. That is perfectly valid and your lambdas are
pretty simple, but I find it easier to read to figure out a function name
that is descriptive and put it in a where rather than a lambda. That way,
someone else (or perhaps you at a later date) can easily scan through the
function definitions and read what is being done in more or less plain
English.

Likewise, my issue with let ... in is mostly that it seems backwards to
me. I prefer to read the broad strokes of code first and the specifics
later, which is why where is more attractive to me.
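
As a rough illustration of the two styles (the function names and the
formatting task are hypothetical, not taken from the program under review),
here is the same function written first with let ... in and an inline
lambda, then with named helpers in a where clause:

```haskell
-- Style 1: let ... in plus an inline lambda; the details come before the result.
formatCountsLet :: [(String, Int)] -> [String]
formatCountsLet counts =
  let width = maximum (0 : map (length . fst) counts)
  in map (\(w, n) -> w ++ replicate (width - length w + 1) ' ' ++ show n) counts

-- Style 2: named helpers in a where clause; the broad strokes come first.
formatCountsWhere :: [(String, Int)] -> [String]
formatCountsWhere counts = map formatLine counts
  where
    width = maximum (0 : map (length . fst) counts)
    formatLine (w, n) = padTo width w ++ " " ++ show n
    padTo k s = s ++ replicate (k - length s) ' '
```

Both versions produce the same output; the second lets a reader see
"map formatLine counts" first and look up the helpers only if needed.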

It otherwise looks like pretty clean code to me. Certainly cleaner than the
stuff I started out with.

Just my two cents.

-- 
Michael Xavier
http://www.michaelxavier.net
LinkedIn <http://www.linkedin.com/pub/michael-xavier/13/b02/a26>

------------------------------

Message: 3
Date: Sun, 9 Oct 2011 23:40:21 +0200
From: Chaddaï Fouché <chaddai.fou...@gmail.com>
Subject: Re: [Haskell-beginners] How would you improve this program?
To: Lorenzo Bolla <lbo...@gmail.com>
Cc: beginners@haskell.org
Message-ID:
        <CANfjZRZUYUp_rdvMrq5AG5wQQFN=htyko7rsgfmrv4cq18h...@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8

On Sun, Oct 9, 2011 at 10:11 PM, Lorenzo Bolla <lbo...@gmail.com> wrote:
> I've implemented 3 versions of the algorithm:
>
> a Haskell version using the standard "sort": read all the words from stdin,
> sort them and group them.
> a Haskell version using map: read all the words from stdin, stick each word
> in a Data.Map incrementing a counter if the word is already present in the
> map.

You seem to be using fromListWith, which is lazy: you're accumulating
enormous (1 + 1 + ... + 1) thunks in your map, especially in the tests
with higher word counts. GHC might be optimizing that away for you, but
I strongly doubt it. You should probably replace it with a stricter
version:

> fromListWith' f xs = foldl' ins empty xs
>     where ins m (k,x) = insertWith' f k x m
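
A minimal sketch of the same idea for current versions of the containers
package, assuming Data.Map.Strict is available (to my knowledge insertWith'
was later deprecated and dropped from Data.Map). The wordCounts helper is
hypothetical and only shows how the function might be used:

```haskell
import Data.List (foldl')
import qualified Data.Map.Strict as M

-- Strict counterpart of fromListWith: each combined value is forced to weak
-- head normal form when it is stored, so no (1 + 1 + ... + 1) chain builds up.
fromListWith' :: Ord k => (a -> a -> a) -> [(k, a)] -> M.Map k a
fromListWith' f = foldl' ins M.empty
  where
    ins m (k, x) = M.insertWith f k x m

-- Hypothetical helper: count the occurrences of each word.
wordCounts :: [String] -> M.Map String Int
wordCounts ws = fromListWith' (+) [(w, 1) | w <- ws]
```

Note that fromListWith from Data.Map.Strict is itself strict in the values,
so importing it from there instead of Data.Map may be enough on its own.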

I hope that helps, otherwise it seems pretty good (but I only gave it
a passing glance).

-- 
Jedaï

------------------------------

_______________________________________________
Beginners mailing list
Beginners@haskell.org
http://www.haskell.org/mailman/listinfo/beginners

End of Beginners Digest, Vol 40, Issue 11
*****************************************
