Send Beginners mailing list submissions to
        beginners@haskell.org

To subscribe or unsubscribe via the World Wide Web, visit
        http://www.haskell.org/mailman/listinfo/beginners
or, via email, send a message with subject or body 'help' to
        beginners-requ...@haskell.org

You can reach the person managing the list at
        beginners-ow...@haskell.org

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Beginners digest..."


Today's Topics:

   1. Re:  Fwd: Implementing a spellchecker - problem with
      Data.HashTable performance (Karol Samborski)
   2. Re:  Fwd: Implementing a spellchecker - problem with
      Data.HashTable performance (Brent Yorgey)
   3. Re:  Two questions for a production project... (umptious)
   4. Re:  Fwd: Implementing a spellchecker - problem with
      Data.HashTable performance (umptious)
   5. Re:  Fwd: Implementing a spellchecker - problem with
      Data.HashTable performance (Karol Samborski)
   6. Re:  Training tasks (umptious)


----------------------------------------------------------------------

Message: 1
Date: Fri, 20 Apr 2012 12:30:38 +0200
From: Karol Samborski <edv.ka...@gmail.com>
Subject: Re: [Haskell-beginners] Fwd: Implementing a spellchecker -
        problem with Data.HashTable performance
To: Rados?aw Szymczyszyn <lav...@gmail.com>
Cc: beginners@haskell.org
Message-ID:
        <CACe2dTt7vt2q3ypbaB+XU35CU=t1jn7bnucrzzcf5gwfzib...@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8

W dniu 20 kwietnia 2012 11:33 u?ytkownik Rados?aw Szymczyszyn
<lav...@gmail.com> napisa?:
> Hello fellow Haskellers,
>
> It's my first post to this list, so at first: Greetings to everyone!
>
> I've wanted to pick up Haskell for some time and I've run in
> to a perfect (at least it seemed so) oocasion at my Natural Language
> Processing course. At first I tried using Python, but it turned out
> that some extra performance wouldn't hurt and I hoped Haskell would
> give me some boost without forcing me to sacrifice high-level
> expressiveness at the same time.
>
> The first lesson was to use ByteStrings instead of ordinary Strings
> for IO. Then I needed to build a dictionary of words found in the
> source text. That's the problematic part - I've tried using a few
> structures, namely Data.Map and Data.HashTable from base package and
> Data.HashMap from unordered-containers. In all cases I couldn't reach
> acceptable performance. As of that, I'd like to ask about some advice
> and comments about the following examples.
>
> First example [1] just builds a HashTable of ~1 500 000 ByteStrings.
> The program takes about 11 seconds to execute. Second example [2] does
> roughly the same, but taking the HashTable data from a file (just a
> dictionary of valid Polish forms). Reading data itself is not a
> problem, as I've measured the execution time of a program which just
> reads the file (and prints its last line to actually force the read,
> hence "handle" Haskell's laziness) and it takes no more than a second.
> Coupled together reading and building, however, takes unbearably long.
> I didn't wait till the program stopped and just terminated it, but it
> had been running for ~10 minutes already.
>
> The most disappointing fact is that a roughly equivalent (and rather
> effortless to write in contrast to the Haskell example - I'm just
> learning though) Python test which just reads everything into a dict
> takes about 10 seconds to execute (reading and updating the dict of
> course). Writing the spellchecker in Python seemed pretty
> straightforward for me (but its certainly not the only language I
> know; I've had some Java (who didn't nowadays), system level C and am
> daily working in Erlang). I understand of course, that the most
> convenient tool is generally the one you know the best, but my
> experience writing the Haskell equivalent reached both extremes -
> effortless expression of the logic, but frustration because of the
> unexplainable (for me at least) performance behaviour.
>
> The question is why is it so slow? (or AM I RILY DAT DUMB? ;)
>
> The link to examples:
> [1] https://gist.github.com/2411699 TestHashTableArtificial
> [2] https://gist.github.com/2411699 TestReadIntoHashTable
>
> I'm looking forward to somebody shedding some light on this. Thanks in 
> advance!
>
> Best regards,
> Radek Szymczyszyn

Hi!

Did you try with -O2?

I've created file "formy.utf8" filled with 1400000 lines of word
"?y?niejszymi" and here is my output:

$ ghc --make -O2 TestReadIntoHashTable.hs -o test -rtsopts
$ time ./test +RTS -K100m -RTS
1400000

real    0m3.265s
user    0m3.097s
sys     0m0.143s

Best,
Karol Samborski



------------------------------

Message: 2
Date: Fri, 20 Apr 2012 07:52:56 -0400
From: Brent Yorgey <byor...@seas.upenn.edu>
Subject: Re: [Haskell-beginners] Fwd: Implementing a spellchecker -
        problem with Data.HashTable performance
To: beginners@haskell.org
Message-ID: <20120420115256.ga22...@seas.upenn.edu>
Content-Type: text/plain; charset=utf-8

I seem to recall that Data.HashTable is quite slow and not recommended
for use.  Try using Data.HashMap.Lazy/Strict from the
unordered-containers package instead.

-Brent

On Fri, Apr 20, 2012 at 11:33:51AM +0200, Rados?aw Szymczyszyn wrote:
> Hello fellow Haskellers,
> 
> It's my first post to this list, so at first: Greetings to everyone!
> 
> I've wanted to pick up Haskell for some time and I've run in
> to a perfect (at least it seemed so) oocasion at my Natural Language
> Processing course. At first I tried using Python, but it turned out
> that some extra performance wouldn't hurt and I hoped Haskell would
> give me some boost without forcing me to sacrifice high-level
> expressiveness at the same time.
> 
> The first lesson was to use ByteStrings instead of ordinary Strings
> for IO. Then I needed to build a dictionary of words found in the
> source text. That's the problematic part - I've tried using a few
> structures, namely Data.Map and Data.HashTable from base package and
> Data.HashMap from unordered-containers. In all cases I couldn't reach
> acceptable performance. As of that, I'd like to ask about some advice
> and comments about the following examples.
> 
> First example [1] just builds a HashTable of ~1 500 000 ByteStrings.
> The program takes about 11 seconds to execute. Second example [2] does
> roughly the same, but taking the HashTable data from a file (just a
> dictionary of valid Polish forms). Reading data itself is not a
> problem, as I've measured the execution time of a program which just
> reads the file (and prints its last line to actually force the read,
> hence "handle" Haskell's laziness) and it takes no more than a second.
> Coupled together reading and building, however, takes unbearably long.
> I didn't wait till the program stopped and just terminated it, but it
> had been running for ~10 minutes already.
> 
> The most disappointing fact is that a roughly equivalent (and rather
> effortless to write in contrast to the Haskell example - I'm just
> learning though) Python test which just reads everything into a dict
> takes about 10 seconds to execute (reading and updating the dict of
> course). Writing the spellchecker in Python seemed pretty
> straightforward for me (but its certainly not the only language I
> know; I've had some Java (who didn't nowadays), system level C and am
> daily working in Erlang). I understand of course, that the most
> convenient tool is generally the one you know the best, but my
> experience writing the Haskell equivalent reached both extremes -
> effortless expression of the logic, but frustration because of the
> unexplainable (for me at least) performance behaviour.
> 
> The question is why is it so slow? (or AM I RILY DAT DUMB? ;)
> 
> The link to examples:
> [1] https://gist.github.com/2411699 TestHashTableArtificial
> [2] https://gist.github.com/2411699 TestReadIntoHashTable
> 
> I'm looking forward to somebody shedding some light on this. Thanks in 
> advance!
> 
> Best regards,
> Radek Szymczyszyn
> 
> _______________________________________________
> Beginners mailing list
> Beginners@haskell.org
> http://www.haskell.org/mailman/listinfo/beginners



------------------------------

Message: 3
Date: Fri, 20 Apr 2012 14:38:34 +0100
From: umptious <umpti...@gmail.com>
Subject: Re: [Haskell-beginners] Two questions for a production
        project...
To: Michael Orlitzky <mich...@orlitzky.com>
Cc: beginners@haskell.org
Message-ID:
        <cae20bnv0b+wweshuclzap08xrcxhd1bh_1i1_3u-aas2qwx...@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

This is the sort of question you might want to ask on Haskell cafe too.

But

- This really comes down to whether the GC can trusted, and I would
astonished if it was otherwise. It would make writing web frameworks in
Haskell rather pointless. (Otoh I can't find a list of real world users for
Yesod etc.)

- A lot of people use XMonad to manage their desktops and just leave their
machines on *forever.*

- If Haskell does prove to be unsuitable (which is very unlikely) then
Erlang is a safe bet. It's about the most reliable language there is - it
was designed for telecoms. It's much simpler to learn than Haskell,
although less funky (eg no lazy evaluation) and closer to a scripting
language in speed. Get a copy of Programming *Erlang*: Software for a
Concurrent World by Joe *Armstrong* and with even minimal Haskell
experience you should be writing code straight away.

On 20 April 2012 03:43, Michael Orlitzky <mich...@orlitzky.com> wrote:

> On 04/19/2012 07:40 PM, Mike Meyer wrote:
> > I've got a project coming up that I could use haskell for, providing I
> > can convince myself that it's appropriate. The critical question is:
> >
> > Anyone using Haskell in a long-running production application? I'm
> > talking about something where the program would run for weeks or
> > months non-stop, with either multiple threads or multiple copies.
>
> I have a rather stupid application that parses feeds from the Twitter
> API. For each username given on the command line, it calls forkIO and
> the new thread enters a recursive loop:
>
>  recurse x y z = do
>    ...
>    recurse x y z
>
> forever and ever.
>
> Contrary to my expectations (I still basically don't know what I'm doing
> in Haskell) it seems to run in constant space for months at a time.
>
> _______________________________________________
> Beginners mailing list
> Beginners@haskell.org
> http://www.haskell.org/mailman/listinfo/beginners
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
<http://www.haskell.org/pipermail/beginners/attachments/20120420/8768de00/attachment-0001.htm>

------------------------------

Message: 4
Date: Fri, 20 Apr 2012 14:45:25 +0100
From: umptious <umpti...@gmail.com>
Subject: Re: [Haskell-beginners] Fwd: Implementing a spellchecker -
        problem with Data.HashTable performance
Cc: beginners@haskell.org
Message-ID:
        <cae20bnv3fex8etersm2ekl-5vzuvw5z8vrhwz3sakgyx4wt...@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

While I think Misters Samborski and Yorgey have fixed the performance
problem, I'm still curious about the apparent synergy effect:

> First example [1] just builds a HashTable of ~1 500 000 ByteStrings.
> > The program takes about 11 seconds to execute.
>


> > Reading data itself is not a
> > problem, as I've measured the execution time of a program which just
> > reads the file (and prints its last line to actually force the read,
> > hence "handle" Haskell's laziness) and it takes no more than a second.
>


> > Coupled together reading and building, however, takes unbearably long.
> > I didn't wait till the program stopped and just terminated it, but it
> > had been running for ~10 minutes already.
>

???
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
<http://www.haskell.org/pipermail/beginners/attachments/20120420/474e2415/attachment-0001.htm>

------------------------------

Message: 5
Date: Fri, 20 Apr 2012 16:01:53 +0200
From: Karol Samborski <edv.ka...@gmail.com>
Subject: Re: [Haskell-beginners] Fwd: Implementing a spellchecker -
        problem with Data.HashTable performance
To: umptious <umpti...@gmail.com>
Cc: beginners@haskell.org
Message-ID:
        <cace2dtuhrobegtxuuq1qrql1b8b8rtznpttcepdnt4cvvn0...@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8

W dniu 20 kwietnia 2012 15:45 u?ytkownik umptious <umpti...@gmail.com> napisa?:
> While I think Misters Samborski and Yorgey have fixed the performance
> problem, I'm still curious about the apparent synergy effect:
>

Hi,

I wrote about the example program of that effect last time. With -O2
flag it took 3 sek.

Best,
Karol Samborski



------------------------------

Message: 6
Date: Fri, 20 Apr 2012 15:02:21 +0100
From: umptious <umpti...@gmail.com>
Subject: Re: [Haskell-beginners] Training tasks
To: beginners@haskell.org
Message-ID:
        <CAE20bNuGd+MaWL6Jme8zqo33ePUo-tKi82ZO-=xwggcxcac...@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

On 19 April 2012 20:20, Michael Litchard <mich...@schmong.org> wrote:

> Hi, I noticed one of your suggestions
>
> > - A version of the classic Traveller/Elite trading system - a really nice
> > little toy business rules system that can be code either minimally or to
> > show off the funkier features of a language to get more flexibility and
> more
> > interesting pricing rules. Minimal version would be a few tens of lines
> and
> > the config file declaring item
>
> I couldn't find anything on Elite, could you give me a reference?
> I found a reference concerning Traveller, but if you had anything in
> mind, it may be helpful.
>
> Thanks :)
>

If you download Oolite, available from most Linux repositories, it should
have the same trading rules as the original Elite (which it stole from the
Traveller pen and paper game.) There's a space trading game for the Palm
Pilot that has improved more interesting rules - maybe you can find a Palm
emulator if you don't have one? -
http://en.wikipedia.org/wiki/Space_Trader_%28Palm_OS%29

Basically, you like goods that can be traded and you have worlds to sell
them on. Worlds have government types and economic types - and in more
complex variants ecologies and events - that affect the price and supply of
goods. Eg Fish will be cheap on a Farming world with an Ocean environment,
but expensive on an Industrial Desert world especially if it is Rich; Guns
will sell for more if a War is going on; Survival Equipment and any Food
Item will sell for more in the aftermath of a disaster. You can also model
legality based on a Government type.

The challenge is to get a system that allows trade rules to be concise and
expressive while being written in a form that can be edited by a
non-programmer - the config can be an .hs file, but it should be one that
even a non-coder can understand and edit. It's a nice problem because it's
scalable (you increase the complexity of the rules) and it has a lot of
characteristics of real business systems. I knocked out the core of a
solution last night - together with the recursive Shape Editor it's one of
the best language test programs I know - and should have something nice
around Monday if you want a copy.

- Jay
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
<http://www.haskell.org/pipermail/beginners/attachments/20120420/e5675244/attachment.htm>

------------------------------

_______________________________________________
Beginners mailing list
Beginners@haskell.org
http://www.haskell.org/mailman/listinfo/beginners


End of Beginners Digest, Vol 46, Issue 37
*****************************************

Reply via email to