Re: [Haskell-cafe] OCaml list sees abysmal Language Shootout results

2004-10-06 Thread Greg Buchholz
Ketil Malde wrote:
> Ketil Malde <[EMAIL PROTECTED]> writes:
> 
> > To get memory consumption down, I tried a strict "update" function:
> 
> >    update k fm = let x = (get hash1 k + get fm k)
> >                  in x `seq` addToFM fm k x
> 
> > which slowed the program down(!), 

Yes that fixes(?) it.  The strict update removes the space leak and
makes the FiniteMap perform as expected.

> 
> I wonder if this isn't due to never evaluating the values for
> "foo_2" to "foo_9998" because of laziness?

Maybe. On a whim I thought that maybe unevaluated addition thunks
from addToFM_C were piling up, so I changed to addListToFM_C instead...

 let update k fm = addListToFM_C (+) fm $ replicate (read n) (k, get hash1 k)
 let res = foldr update hash2 keys

...but that hardly made a difference.  So the search continues.


Greg Buchholz
___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] OCaml list sees abysmal Language Shootout results

2004-10-06 Thread Ketil Malde
Ketil Malde <[EMAIL PROTECTED]> writes:

> To get memory consumption down, I tried a strict "update" function:

>    update k fm = let x = (get hash1 k + get fm k)
>                  in x `seq` addToFM fm k x

> which slowed the program down(!), 

I wonder if this isn't due to never evaluating the values for
"foo_2" to "foo_9998" because of laziness?

> BTW, I looked at the shootout web pages, but I couldn't find the
> specification for any of the benchmarks.  What is and isn't allowed? 

For instance, changing the order of the updates shaves another
10-20% off the time (because of cache-friendliness, I suppose):

  - let res = foldr update hash2 (concat $ replicate (read n) keys)
  + let res = foldr update hash2 (concat $ map (replicate (read n)) keys)
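
(As a small illustration of the difference in key order, not taken from the mail:)

  concat (replicate 2 ["foo_1","foo_2"])       == ["foo_1","foo_2","foo_1","foo_2"]
  concat (map (replicate 2) ["foo_1","foo_2"]) == ["foo_1","foo_1","foo_2","foo_2"]

so with the second form all updates for one key happen before moving on to the next.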

-kzm
-- 
If I haven't seen further, it is by standing in the footprints of giants
___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] OCaml list sees abysmal Language Shootout results

2004-10-06 Thread Ketil Malde
Greg Buchholz <[EMAIL PROTECTED]> writes:

> I've been looking at the other shootout results (with the hope of
> learning something about making haskell programs faster/less memory
> hungry) and I couldn't quite figure out why the "Hashes, part II" test
> consumes so much memory ( http://shootout.alioth.debian.org/bench/hash2/ ). 
> So I started to try heap profiling, and generated the following graphs
> for the different types of profiles...

> biography => http://sleepingsquirrel.org/haskell/hash2_b.ps
> retainer  => http://sleepingsquirrel.org/haskell/hash2_r.ps
> closure   => http://sleepingsquirrel.org/haskell/hash2_d.ps
> type  => http://sleepingsquirrel.org/haskell/hash2_y.ps
> cost cntr => http://sleepingsquirrel.org/haskell/hash2_c.ps
>
> ...but I have a hard time figuring out how to prevent something like
> "stg_ap_3_upd_info" or "void" cells from consuming so much memory.

One thing you could do is to move the pure definitions (constants and
functions) out of the monad.  This will make them separate cost
centres, with their own profile information.  I toyed with this, but
admittedly, it didn't change much in this case.  I think it is better
style, though.
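
(A minimal sketch of that change, assumed rather than taken from Ketil's actual
diff; the names are Greg's, and the point is only that the pure bindings move
from 'let' inside main to the top level, where they become their own cost
centres:)

    import Data.FiniteMap

    keys :: [String]
    keys = map (\x -> "foo_" ++ show x) [0..]

    hash1, hash2 :: FiniteMap String Int
    hash1 = listToFM $ zip keys [0..]
    hash2 = listToFM $ zip keys (repeat 0)

    get :: FiniteMap String Int -> String -> Int
    get fm k = lookupWithDefaultFM fm 0 k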

A simple way to improve speed marginally is to specify Int instead of
letting things default to Integer.  A more complex way, saving about
60% of the time, is to use unboxed arrays instead of strings for the
keys - memory consumption seems to be the same, though. 

To get memory consumption down, I tried a strict "update" function:

   update k fm = let x = (get hash1 k + get fm k)
                 in x `seq` addToFM fm k x

which slowed the program down(!), but reduced memory consumption from
about 25Mb to 1.5Mb.  So it seems that the memory consumption is due
to unevaluated values in the FMs.

BTW, I looked at the shootout web pages, but I couldn't find the
specification for any of the benchmarks.  What is and isn't allowed? 

-kzm


import System (getArgs) 
import Data.FiniteMap 
import Data.Array.Unboxed
import Maybe

type Key = UArray Int Char
type Map = FiniteMap (UArray Int Char) Int

hash1, hash2 :: Map
hash1 = listToFM $ zip keys [0..] 
hash2 = listToFM $ zip keys (repeat 0) 

keys :: [Key]
keys = map (\x -> listArray (1,4+length (show x)) ("foo_" ++ show x)) [0..] 
get :: Map -> Key -> Int
get fm k = fromJust $ lookupFM fm k 

update :: Key -> Map -> Map
update k fm = let x = (get hash1 k + get fm k) in x `seq` addToFM fm k x

foo_1 = keys!!1
foo_9999 = keys!!9999

main = do 
 [n] <- getArgs
 let res = foldr update hash2 (concat $ replicate (read n) keys)
 putStrLn $ unwords $ map show [get hash1 foo_1,
                                get hash1 foo_9999,
                                get res foo_1,
                                get res foo_9999]

-- 
If I haven't seen further, it is by standing in the footprints of giants
___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] OCaml list sees abysmal Language Shootout results

2004-10-05 Thread Greg Buchholz
Keith Wansbrough wrote:
> I just saw this on the OCaml list (in a posting by "Rafael 'Dido' 
> Sevilla" <[EMAIL PROTECTED]> in the "Observations on OCaml vs. Haskell" 
> thread).  I can't believe that a simple "wc" implementation should be 
> 570 times slower in Haskell than OCaml - could someone investigate and 
> fix the test?

I've been looking at the other shootout results (with the hope of
learning something about making haskell programs faster/less memory
hungry) and I couldn't quite figure out why the "Hashes, part II" test
consumes so much memory ( http://shootout.alioth.debian.org/bench/hash2/ ). 
So I started to try heap profiling, and generated the following graphs
for the different types of profiles...

biography => http://sleepingsquirrel.org/haskell/hash2_b.ps
retainer  => http://sleepingsquirrel.org/haskell/hash2_r.ps
closure   => http://sleepingsquirrel.org/haskell/hash2_d.ps
type  => http://sleepingsquirrel.org/haskell/hash2_y.ps
cost cntr => http://sleepingsquirrel.org/haskell/hash2_c.ps

...but I have a hard time figuring out how to prevent something like
"stg_ap_3_upd_info" or "void" cells from consuming so much memory.
Anyone have pointers on how to best use the profile information?  I'm
still trying to digest "Heap Profiling for Space Efficiency"
http://portal.acm.org/citation.cfm?id=734156 
Are there any other related papers out there?  (Of course it might be
the case that I need a FiniteMap tutorial)
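
(For reference, graphs of this kind come from GHC's heap-profiling support; the
command lines below are an assumption about how they were produced, not taken
from the mail, and the program name and the argument N are placeholders:)

    ghc -O2 -prof -auto-all -o hash2 hash2.hs
    ./hash2 N +RTS -hb    # biography profile
    ./hash2 N +RTS -hr    # retainer profile
    ./hash2 N +RTS -hd    # closure description profile
    ./hash2 N +RTS -hy    # type profile
    ./hash2 N +RTS -hc    # cost-centre profile
    hp2ps -c hash2.hp     # render the .hp output as PostScript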

Here's the code in question...

import System (getArgs)
import Data.FiniteMap

main = do
 [n] <- getArgs 
 let get fm k = lookupWithDefaultFM fm 0 k
 let keys = map (\x -> "foo_" ++ show x) [0..]
 let hash1 = listToFM $ zip keys [0..]
 let hash2 = listToFM $ zip keys (repeat 0)
 let update k fm = addToFM_C (+) fm k (get hash1 k)
 let res = foldr update hash2 (concat $ replicate (read n) keys)
 putStrLn $ unwords $ map show [get hash1 "foo_1",
                                get hash1 "foo_9999",
                                get res "foo_1",
                                get res "foo_9999"]



Thanks,

Greg Buchholz
___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] OCaml list sees abysmal Language Shootout results

2004-09-30 Thread Ketil Malde
Ketil Malde <[EMAIL PROTECTED]> writes:

>> wc :: !(Int,Int,Int) -> Char -> (Int, Int, Int)

> I'm not sure if that was your question

Sorry about that, brain malfunction, bangs are for data declarations,
I'll get that cup of coffee now. 

I guess what you really want to do, is to put some `seq`s in there.
Something like: 

wc (cs,ws,ls) ... = cs `seq` ws `seq` ls `seq` ...the def. of wc...

which evaluates the Ints before doing anything else.  Or use ($!)
(like function application ($), but strict).
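
Spelled out against Greg's original definition (a sketch of the suggestion, not
code from the thread), that would be something like:

    wc :: (Int,Int,Int) -> Char -> (Int,Int,Int)
    wc (l,w,c) ch = l `seq` w `seq` c `seq`
        case ch of
          '\n' -> (l+1, w,   c+1)
          ' '  -> (l,   w+1, c+1)
          _    -> (l,   w,   c+1)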

-kzm
-- 
If I haven't seen further, it is by standing in the footprints of giants
___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] OCaml list sees abysmal Language Shootout results

2004-09-30 Thread Greg Buchholz
Georg Martius wrote:
> Some more general comment: The code for the shootout doesn't need to be
> extremely fast in my eyes, it needs to be elegant and reasonable in
> performance and memory consumption (in this order). I don't want to say
> that Tomasz's solution is bad, but it is not a typical Haskell-style
> application. If someone (not a Haskeller) looks at the implementation it
> should be very obvious and clear.

It might also be nice if the code would run under the other Haskell
compilers like Hugs and nhc98 right out of the box.


Greg Buchholz 
___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] OCaml list sees abysmal Language Shootout results

2004-09-30 Thread Greg Buchholz
Andrew Cheadle wrote:
> 
> +RTS -Sstderr -RTS and +RTS -sstderr -RTS will probably indicate why.
> I'd be surprised if the amount of data copied for the semi-space
> collector isn't much less than for the generational.

Ahh. Data copied with '-G1' = 58MB vs. 203MB without.  For posterity's
sake, here are the numbers...

With '-G1'
---
306,616,872 bytes allocated in the heap
 58,844,344 bytes copied during GC
 99,316 bytes maximum residency (1169 sample(s))

   1169 collections in generation 0 (  0.62s)

  1 Mb total memory in use

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time    0.68s  (  0.71s elapsed)
  GC    time    0.62s  (  0.68s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time    1.30s  (  1.39s elapsed)

  %GC time  47.7%  (48.9% elapsed)

  Alloc rate    450,907,164 bytes per MUT second

  Productivity  52.3% of total user, 48.9% of total elapsed


Without
---
306,616,872 bytes allocated in the heap
203,339,812 bytes copied during GC
109,088 bytes maximum residency (131 sample(s))

   1169 collections in generation 0 (  2.22s)
131 collections in generation 1 (  0.05s)

  2 Mb total memory in use

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time    0.79s  (  0.92s elapsed)
  GC    time    2.27s  (  2.23s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time    3.06s  (  3.15s elapsed)

  %GC time  74.2%  (70.8% elapsed)

  Alloc rate    388,122,622 bytes per MUT second

  Productivity  25.8% of total user, 25.1% of total elapsed


Greg Buchholz
___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] OCaml list sees abysmal Language Shootout results

2004-09-30 Thread Andrew Cheadle
Hi Greg

>Anyone have an explanation for the 2x speed
>increase for running Kevin's version with '+RTS -G1'?

+RTS -Sstderr -RTS and +RTS -sstderr -RTS will probably indicate why.
I'd be surprised if the amount of data copied for the semi-space
collector isn't much less than for the generational.

Chances are that data is dying off very quickly and very little is
being copied for the semi-space collector - the allocation area
has a variable sizing policy, and an allocation area of that size is
sufficient for the majority of the data to die off before it is filled
and a GC kicks in - hence very little is copied.  However, for the
generational collector, the nursery is of fixed size.  Here, the lifetime
of the data is probably longer than the time it takes for the nursery to
be filled and a minor GC to kick in, so the data is promoted to generation 1
(hence copied), where it then probably dies off but can't be collected
until a major collection kicks in.

Probably, or something like that ;-)

Cheers

Andy

*****************************************************************
*  Andrew Cheadle            email:  [EMAIL PROTECTED]          *
*  Department of Computing   http://www.doc.ic.ac.uk/~amc4/     *
*  Imperial College                                             *
*  University of London                                         *
*****************************************************************
___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] OCaml list sees abysmal Language Shootout results

2004-09-30 Thread Greg Buchholz
Malcolm Wallace wrote:
> Here are the results, in seconds, on my machine (2.4GHz x86/Linux)
> for the suggested input (N=500) from the shootout site.  All Haskell
> versions were compiled with ghc-5.04.2 -O2.

I thought I'd take a stab at timing a few of the examples with
different compiler versions to see what difference that would make
(ghc-6.2.1 vs. ghc-6.3.20040928).  I compared Kevin Everets' version with
the two Tomasz Zielonka versions.  I ran the test with N=2500 (i.e. 2500
copies of the original file, which is what is apparently used in the
shootout) on my AthlonXP 1600 under x86/Linux.

                      6.2.1    6.3.20040928
                      -----    ------------
Kevin                 3.615s   3.156s
Kevin (+RTS -G1)      1.666s   1.405s
Tomasz (pretty)       0.725s   0.481s
Tomasz (fast)         0.403s   0.430s

Interesting to see the speed increase going from 6.2.1 to 6.3 for
Tomasz' pretty example.  Anyone have an explanation for the 2x speed
increase for running Kevin's version with '+RTS -G1'?

(And for reference, here's the results on my machine for the perl and
gcc versions of the test and gnu/wc)

perl-5.8.4  0.535s
gcc-3.4.2   0.102s
gnu/wc  0.435s

Greg Buchholz

___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] OCaml list sees abysmal Language Shootout results

2004-09-30 Thread Georg Martius
Hi folks,
On Thu, 30 Sep 2004 01:02:54 +0100, Sam Mason <[EMAIL PROTECTED]> wrote:
> Greg Buchholz wrote:
> > The algorithm isn't correct (it counts spaces instead of words), but
> > anyone have advice for improving its performance?
>
> You probably want some strictness annotations in there. . .

Last night I tried to improve Greg's wc in a simple fashion and came up with the
same idea of making a new data type with strict fields.  I wondered why one couldn't
add some kind of strictness annotation to the function type.  First attempt:

wc :: !(Int,Int,Int) -> Char -> (Int, Int, Int)

As far as I know the compiler does strictness analysis to find strict arguments in a
function anyway.  Would it make sense to allow this kind of explicit strictness?  Where
are the problems with that?

I mean, laziness is really useful and it is our best friend in this kind of
application, since we can do stream processing without implementing a buffer and so
on.  On the other hand one occasionally gets trapped by it and it is not always easy to
grasp why.

Some more general comment: The code for the shootout doesn't need to be extremely fast
in my eyes, it needs to be elegant and reasonable in performance and memory
consumption (in this order).  I don't want to say that Tomasz's solution is bad, but it
is not a typical Haskell-style application.  If someone (not a Haskeller) looks at the
implementation it should be very obvious and clear.

The last few weeks the list has been full of performance questions (including my own)
and it is really a pity that it is such an issue.  I believe that most problems
occurring with performance and memory consumption could be easily solved by partial
and explicit strictness.  Please enlighten me if I am wrong.
Regards,
 Georg
___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] OCaml list sees abysmal Language Shootout results

2004-09-30 Thread Graham Klyne
At 19:39 30/09/04 +0200, Tomasz Zielonka wrote:
> What I like about GHC is that I can start from simple, high-level,
> sometimes slow solutions, but if there are efficiency problems, there is
> a big chance that I can solve them without switching the language.

That's a very good point, I think.  One to hang on to.

> > But I did wonder if it wouldn't be possible to also abstract out the I/O
> > element of your 'fileIterate', using instead something like:
> >   streamIterate :: [b] -> (a -> b -> a) -> a -> IO a
>
> It seems to be a variant of foldl. You can eliminate IO from return
> type, or is there some reason for it?
Doh!  (Why didn't I spot that?)  All roads lead to Rome, or something like 
that?  There seems to be a recurring tension between how much to specialize 
and how much to generalize.

Maybe it should be something like:
  streamIterate :: (Monad m) => [b] -> (a -> b -> m a) -> a -> m a
?
Er, but that's similarly a variation of foldM, right?
Or maybe my earlier idea was closer:
  streamIterate :: (Monad m) =>
  (c -> m (b,c)) -> c -> (a -> b -> m a) -> a -> m a
?
Hmmm... I feel like an (intellectual) bull-in-a-china-shop here.  I'm 
blundering about on the trail of a delicate and elegant idea that I'm sure 
others could dissect far more clearly.

What I'm trying to capture (I think) is that there's some baggage to do 
with accessing the raw data and emitting the desired result that needs to 
be carried along (or interleaved) with the required computation on that 
data.  Does this make any sense, or am I descending into farce here?
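
(One possible reading of that last signature, as a sketch; the Maybe in the
generator's result is an addition here so that the stream can signal its end,
which the signature above leaves implicit, and the helper names are invented:)

    streamIterate :: Monad m => (c -> m (Maybe (b, c))) -> c -> (a -> b -> m a) -> a -> m a
    streamIterate gen c0 f a0 = go c0 a0
      where
        go c a = do
            next <- gen c
            case next of
                Nothing      -> return a
                Just (b, c') -> f a b >>= go c'

    -- e.g. a character source over a Handle, using System.IO's hIsEOF/hGetChar:
    --   charGen h = do eof <- hIsEOF h
    --                  if eof then return Nothing
    --                         else do c <- hGetChar h
    --                                 return (Just (c, h))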

#g

Graham Klyne
For email:
http://www.ninebynine.org/#Contact
___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] OCaml list sees abysmal Language Shootout results

2004-09-30 Thread Tomasz Zielonka
On Thu, Sep 30, 2004 at 05:40:58PM +0100, Graham Klyne wrote:
> 2. Your fileIterator certainly looks nicer (to me) than your other 
> solution, but...

It looks nicer to me too.

> Tagging along with this debate, I found myself wondering if, in order to 
> get performance comparable to other languages, it is really necessary to 
> write code like code in other languages.

Maybe it's not necessary to get the comparable performance, but it often
is if you want to get the best performance possible. Of course I
wouldn't mind if GHC optimised high level code so well, that we wouldn't
have to do it, but I guess it's just not that easy. 

What I like about GHC is that I can start from simple, high-level,
sometimes slow solutions, but if there are efficiency problems, there is
a big chance that I can solve them without switching the language.

> But I did wonder if it wouldn't be possible to also abstract out the I/O 
> element of your 'fileIterate', using instead something like:
>   streamIterate :: [b] -> (a -> b -> a) -> a -> IO a

It seems to be a variant of foldl. You can eliminate IO from return
type, or is there some reason for it?

Best regards,
Tom

-- 
.signature: Too many levels of symbolic links
___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] OCaml list sees abysmal Language Shootout results

2004-09-30 Thread Graham Klyne
At 16:56 30/09/04 +0200, Tomasz Zielonka wrote:
> Then how about a solution like this: I took your program but used
> my fast fileIterate instead of ,,foldl over getContents''.
> I also added {-# OPTIONS -funbox-strict-fields #-}, and played a bit
> to get the best optimisations from GHC.
>
> It's about 7 times faster this way, but it's still two times slower than
> the solution I sent to shootout.
>
> Devilish plan: Maybe we could have some variants of fileIterate in GHC's
> libraries? ;->
Two responses:
1. I agree that providing the right kind of library functions (and material 
explaining how to use them) may be a key to getting efficient code without 
losing high-level forms of expression.

2. Your fileIterator certainly looks nicer (to me) than your other 
solution, but...

Tagging along with this debate, I found myself wondering if, in order to 
get performance comparable to other languages, it is really necessary to 
write code like code in other languages.  E.g., I thought one of the 
lessons of John Hughes' "Why Functional Programming Matters" was that one 
can achieve greater efficiencies by climbing higher rather than dropping 
down to the level of other languages.  Your fileIterate looks to me like a 
step in the right direction.

But I did wonder if it wouldn't be possible to also abstract out the I/O 
element of your 'fileIterate', using instead something like:
  streamIterate :: [b] -> (a -> b -> a) -> a -> IO a

(I was originally thinking of something like:
  streamIterate :: (c -> (b,c)) -> c -> (a -> b -> a) -> a -> IO a
where the first argument is a function that takes a sequence generator and 
returns the next member of the sequence+new generator, and the 2nd arg is 
the initial generator.)

For such an approach to be useful, I think it would also be important to 
have variations of functions like length, lines, words that can be combined 
to make a function like your wc'.

#g

Graham Klyne
For email:
http://www.ninebynine.org/#Contact
___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] OCaml list sees abysmal Language Shootout results

2004-09-30 Thread Greg Buchholz
Sam Mason wrote:
> 
> You probably want some strictness annotations in there. . .

Now we're getting somewhere.  When I replace the tuples with my own
(strict) data structure, it gets about 7.5 times faster than the original
shootout example (or about 24 times faster than the version with
tuples).  I get another 2x speedup when I pass '+RTS -G1' to the
executable.  So the version below is about 15 times faster than the
original using the 3MB data file from the shootout. (Now we're only
about 40x slower than ocaml).  Don't forget to turn on '-funbox-strict-fields' 
for an additional improvement.

-- Compile with...
-- ghc -O2 -ddump-simpl -fvia-c -funbox-strict-fields -o wc_fold2 wc_fold2.hs
-- execute as...  ./wc_fold2 +RTS -G1 <input file
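
(The program itself appears to have been eaten by the list archive along with the
rest of the run command above; the following is a guess at the shape of the code
described, with an invented type name, not Greg's exact source:)

import IO

data Counts = Counts !Int !Int !Int deriving Show
-- lines, "words" (really spaces, as in the original), characters

wc2 :: Counts -> Char -> Counts
wc2 (Counts l w c) '\n' = Counts (l+1) w     (c+1)
wc2 (Counts l w c) ' '  = Counts l     (w+1) (c+1)
wc2 (Counts l w c)  _   = Counts l     w     (c+1)

main = do file <- getContents
          putStrLn (show (foldl wc2 (Counts 0 0 0) file))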


Re: [Haskell-cafe] OCaml list sees abysmal Language Shootout results

2004-09-30 Thread Tomasz Zielonka
On Thu, Sep 30, 2004 at 09:49:46AM -0400, Kevin Everets wrote:
> I took Georg's, fixed the word count logic and added prettier
> printing, and then combined it with Sam's main (which I find more
> elegant, but others may find less straightforward).  I think it
> strikes a good balance between efficiency and elegance.

Then how about a solution like this: I took your program but used
my fast fileIterate instead of ,,foldl over getContents''. 
I also added {-# OPTIONS -funbox-strict-fields #-}, and played a bit
to get the best optimisations from GHC.

It's about 7 times faster this way, but it's still two times slower than
the solution I sent to shootout.

Devilish plan: Maybe we could have some variants of fileIterate in GHC's
libraries? ;->

I remember that someone proposed similar functions on haskell's lists
some time ago, but can't remember who.

Best regards,
Tom

-- 
.signature: Too many levels of symbolic links

{-# OPTIONS -funbox-strict-fields #-}

import System.IO
import Data.Array.IO
import Data.Array.Base
import Data.Word
import Data.Int
import List
import Char

main = fileIterate stdin wc' (C 0 0 0 False) >>= putStrLn . showC

data C = C !Int !Int !Int !Bool deriving Show
-- Line Word Char InWord

showC (C l w c _) = show l ++ " " ++ show w ++ " " ++ show c

wc' :: C  -> Char -> C
wc' (C l w c _) '\n' = C (l+1) w (c+1) False
wc' (C l w c _) ' '  = C l w (c+1) False
wc' (C l w c _) '\t' = C l w (c+1) False
wc' (C l w c False) _ = C l (w+1) (c+1) True
wc' (C l w c True)  _ = C l w (c+1) True



{-# INLINE fileIterate #-}

fileIterate :: Handle -> (a -> Char -> a) -> a -> IO a
fileIterate h f a0 = do
    buf <- newArray_ (0, bufSize - 1) :: IO (IOUArray Int Word8)
    let loop i n a
            | i `seq` n `seq` a `seq` False = undefined
            | i == n =
                do  n' <- hGetArray h buf bufSize
                    if n' == 0
                        then return a
                        else loop 0 n' a
            | otherwise =
                do  c <- fmap (toEnum . fromEnum) (readArray buf i)
                    loop (i + 1) n (f a c)
    loop 0 0 a0
  where
    bufSize :: Int
    bufSize = 4096

___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] OCaml list sees abysmal Language Shootout results

2004-09-30 Thread Kevin Everets
On Thu, Sep 30, 2004 at 11:26:15AM +0100, Malcolm Wallace wrote:

> Just out of interest, I ran all of these suggested variations of
> the word count solution in Haskell head-to-head against each other.
> Here are the results, in seconds, on my machine (2.4GHz x86/Linux)
> for the suggested input (N=500) from the shootout site.  All Haskell
> versions were compiled with ghc-5.04.2 -O2.
> 
>   original space-leaky    2.257
>   Greg Buchholz           1.619   *
>   Sam Mason               0.594
>   Malcolm Wallace         0.457
>   Georg Martius           0.322   *
>   Tomasz Zielonka         0.047
>   linux 'wc'              0.085
> 
> Those marked with a * gave the wrong number of words.  The really
> interesting thing is that Tomasz's solution is twice as fast as the
> standard Gnu implementation!

I took Georg's, fixed the word count logic and added prettier
printing, and then combined it with Sam's main (which I find more
elegant, but others may find less straightforward).  I think it
strikes a good balance between efficiency and elegance.

Cheers,

Kevin.

--

import IO

main = getContents >>= putStrLn . showC . foldl wc' (C 0 0 0 False)

data C = C !Int !Int !Int !Bool deriving Show
-- Line Word Char InWord

showC (C l w c _) = show l ++ " " ++ show w ++ " " ++ show c

wc' :: C  -> Char -> C
wc' (C l w c _) '\n' = C (l+1) w (c+1) False
wc' (C l w c _) ' '  = C l w (c+1) False
wc' (C l w c _) '\t' = C l w (c+1) False
wc' (C l w c False) _ = C l (w+1) (c+1) True
wc' (C l w c True)  _ = C l w (c+1) True
___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] OCaml list sees abysmal Language Shootout results

2004-09-30 Thread Tomasz Zielonka
On Thu, Sep 30, 2004 at 11:26:15AM +0100, Malcolm Wallace wrote:
> Those marked with a * gave the wrong number of words.  The really
> interesting thing is that Tomasz's solution is twice as fast as the
> standard Gnu implementation!

That's probably because Gnu wc is locale aware.

Best regards,
Tom

-- 
.signature: Too many levels of symbolic links
___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] OCaml list sees abysmal Language Shootout results

2004-09-30 Thread Malcolm Wallace
Just out of interest, I ran all of these suggested variations of
the word count solution in Haskell head-to-head against each other.
Here are the results, in seconds, on my machine (2.4GHz x86/Linux)
for the suggested input (N=500) from the shootout site.  All Haskell
versions were compiled with ghc-5.04.2 -O2.

original space-leaky    2.257
Greg Buchholz           1.619   *
Sam Mason               0.594
Malcolm Wallace         0.457
Georg Martius           0.322   *
Tomasz Zielonka         0.047
linux 'wc'              0.085

Those marked with a * gave the wrong number of words.  The really
interesting thing is that Tomasz's solution is twice as fast as the
standard Gnu implementation!

Regards,
Malcolm
___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] OCaml list sees abysmal Language Shootout results

2004-09-29 Thread Georg Martius
On Wed, 29 Sep 2004 15:58:24 -0700, Greg Buchholz <[EMAIL PROTECTED]> wrote:

> Just for the heck of it, I'd thought I'd try to write a naive 1-pass
> version of the program.  It turned out to be 4.5 times slower than the
> original...
>
> -- compiled with:  ghc -O2 -ddump-simpl -fvia-c  -o wc_fold wc_fold.hs
>
> import IO
>
> main = do   file <- getContents
>             putStrLn (show (foldl wc (0,0,0) file))
>
> wc :: (Int,Int,Int) -> Char -> (Int, Int, Int)
> wc (l,w,c) '\n' = (l+1,w  ,c+1)
> wc (l,w,c) ' '  = (l  ,w+1,c+1)
> wc (l,w,c)  x   = (l  ,w  ,c+1)
Using strictness is the trick to exploit the tail recursion.  There is possibly a
better way, but:
data C = C !Int !Int !Int deriving Show
wc' :: C  -> Char -> C
wc' (C l w c) '\n' = C (l+1) w (c+1)
wc' (C l w c) ' '  = C l (w+1) (c+1)
wc' (C l w c)  x   = C l w (c+1)
is significantly faster.  The results on a 12MB file:
original:
  1,699,541,396 bytes allocated in the heap
    340,796,888 bytes copied during GC
     75,415,872 bytes maximum residency (8 sample(s))
            146 Mb total memory in use
  Total time    5.92s  (  6.09s elapsed)

wc (yours):
  I don't have enough memory :-(

wc':
    535,286,872 bytes allocated in the heap
    187,340,032 bytes copied during GC
        135,696 bytes maximum residency (130 sample(s))
              2 Mb total memory in use
  Total time    2.45s  (  2.53s elapsed)
Cheers,
 Georg
--
 Georg Martius,  Tel: (+49 34297) 89434 
--- http://www.flexman.homeip.net -
___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] OCaml list sees abysmal Language Shootout results

2004-09-29 Thread Sam Mason
Greg Buchholz wrote:
>The algorithm isn't correct (it counts spaces instead of words), but
>anyone have advice for improving its performance?

You probably want some strictness annotations in there. . .  When I
tried the same thing, I came up with something like:

> import Char;
>
> cclass c | isSpace c = (c == '\n', False)
>  | otherwise = (False, True)
>
> data Cdata = Cdata !Bool !Int !Int !Int
>   deriving Show
>
> combine (Cdata last l w c) (nl, iw) = Cdata iw l' w' (c + 1)
>     where l' = if nl then l + 1 else l
>           w' = if not last && iw then w + 1 else w
>
> wc = foldl combine (Cdata False 0 0 0) . map cclass
>
> main = getContents >>= putStrLn . show . wc

It seems to work in constant stack space, and gives the same answers
(albeit not very beautifully) as my GNU copy of "wc".

>Is the problem
>caused by the laziness of the Int's inside the tuple?

I'm pretty sure that's what's causing it.  I had quite a search around
when my version was running out of memory and everything seemed to
suggest that, basically, the algorithm is building a massive list of
"+1"s that only actually get executed when you try and print the
totals at the end.
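
(Roughly, as an illustration rather than anything from the mail: with lazy
tuples the fold builds

    foldl wc (0,0,0) "ab\n"  ==  wc (wc (wc (0,0,0) 'a') 'b') '\n'

and each counter inside the tuples is itself an unevaluated chain such as
((0+1)+1)+1, none of which gets reduced until the final show demands it.)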

Any comments from more authoritative sources?

>Here is the
>report from ghc with the '-ddump-simpl' option.

If anyone has any hints about how to read this output, I'd be
grateful.  It makes a bit of sense, but I don't really know what it
"means".  I.e. it's obviously the simplified parse tree and I can see
how it relates back to the source (loosely), but attempting to figure
out if something's going to be as leaky as a sieve or not is beyond
me.
___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] OCaml list sees abysmal Language Shootout results

2004-09-29 Thread Greg Buchholz
Malcolm Wallace wrote:
> Graham Klyne <[EMAIL PROTECTED]> writes:
> > I can see that this requires the original file to be kept for 3-time 
> > scanning,  so enough memory for the entire file will be required.  Is that 
> > *the* problem to which you allude?  I can't see any other problem 
> > here.
> 
> Yes, it is the main problem.

> In any case, for the shootout, this is patently a different algorithm
> to the one every other solution uses.  All the other languages do a
> simple one-pass traversal of the file, in constant memory space.  Why
> artificially handicap Haskell by insisting it do the job naively?

Just for the heck of it, I'd thought I'd try to write a naive 1-pass
version of the program.  It turned out to be 4.5 times slower than the
original...

-- compiled with:  ghc -O2 -ddump-simpl -fvia-c  -o wc_fold wc_fold.hs

import IO

main = do   file <- getContents
            putStrLn (show (foldl wc (0,0,0) file))

wc :: (Int,Int,Int) -> Char -> (Int, Int, Int)
wc (l,w,c) '\n' = (l+1,w  ,c+1)
wc (l,w,c) ' '  = (l  ,w+1,c+1)
wc (l,w,c)  x   = (l  ,w  ,c+1)
 

The algorithm isn't correct (it counts spaces instead of words), but
anyone have advice for improving its performance?  Is the problem
caused by the laziness of the Int's inside the tuple?  Here is the
report from ghc with the '-ddump-simpl' option.

Main.wc :: (GHC.Base.Int, GHC.Base.Int, GHC.Base.Int)
           -> GHC.Base.Char
           -> (GHC.Base.Int, GHC.Base.Int, GHC.Base.Int)
[GlobalId]
Arity 2 __P Main.$wwc NoCafRefs Str: DmdType U(LLL)U(L)m
Main.wc = __inline_me (\ w :: (GHC.Base.Int, GHC.Base.Int, GHC.Base.Int)
                         w1 :: GHC.Base.Char ->
                       case w of w2 { (ww, ww1, ww2) ->
                       case w1 of w3 { GHC.Base.C# ww3 ->
                       case Main.$wwc ww ww1 ww2 ww3 of ww4 { (# ww5, ww6, ww7 #) ->
                       (ww5, ww6, ww7)
                       }
                       }
                       })

Rec {
Main.$wlgo :: GHC.Base.Int
           -> GHC.Base.Int
           -> GHC.Base.Int
           -> [GHC.Base.Char]
           -> (# GHC.Base.Int, GHC.Base.Int, GHC.Base.Int #)




And here is the memory allocation report generated from passing 
'+RTS -M900m -k250m -sstderr' to the executable...

875,150,472 bytes allocated in the heap
149,008,464 bytes copied during GC
250,845,060 bytes maximum residency (2 sample(s))

478 collections in generation 0 ( 10.44s)
  2 collections in generation 1 (  0.00s)

813 Mb total memory in use

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time    0.93s  (  1.03s elapsed)
  GC    time   10.44s  ( 11.38s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time   11.37s  ( 12.41s elapsed)

  %GC time  91.8%  (91.7% elapsed)

  Alloc rate    941,022,012 bytes per MUT second

  Productivity   8.2% of total user, 7.5% of total elapsed

Finally, here's the profiling report (-pT)...

Wed Sep 29 15:21 2004 Time and Allocation Profiling Report (Final)

   wc_fold +RTS -M800m -k250m -pT -RTS

total time  =1.22 secs   (61 ticks @ 20 ms)
total alloc = 173,502,056 bytes  (excludes profiling overheads)

COST CENTRE    MODULE    %time  %alloc

wc             Main       60.7    64.8
main           Main       39.3    35.2


                                                   individual     inherited
COST CENTRE  MODULE                  no.  entries  %time %alloc  %time %alloc

MAIN         MAIN                      1        0    0.0    0.0  100.0  100.0
 main        Main                    146        1   39.3   35.2  100.0  100.0
  wc         Main                    147  3048000   60.7   64.8   60.7   64.8
 CAF         GHC.Handle              103        3    0.0    0.0    0.0    0.0
 CAF         System.Posix.Internals   81        4    0.0    0.0    0.0    0.0



Any hints appreciated,

Greg Buchholz

___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] OCaml list sees abysmal Language Shootout results

2004-09-29 Thread Malcolm Wallace
Graham Klyne <[EMAIL PROTECTED]> writes:

> > main = do file <- getContents
> >           putStrLn $ show (length $ lines file) ++ " " ++
> >                      show (length $ words file) ++ " " ++
> >                      show (length file)
> >
> >Space-leak or what?
> 
> I can see that this requires the original file to be kept for 3-time 
> scanning,  so enough memory for the entire file will be required.  Is that 
> *the* problem to which you allude?  I can't see any other problem 
> here.

Yes, it is the main problem.  Don't forget, the shootout benchmark
runs this example over a very large input (15Mb).  Since the
character-list stored in memory for this file takes approximately 12
bytes per character, that blows up to about 180Mb to store temporarily.
The shootout performance figures reckon that ghc actually uses 223Mb
in total.

>  And why would this put Haskell at a disadvantage?

Large live heap space means a large time spent in GC, trying to find
the needle that is actual garbage in the haystack of live pointers.
It also increases the likelihood of cache misses and all kinds of
other bad memory effects.  In other words, wasting space is wasting
time.  There is a good literature on heap profiling in Haskell which
demonstrates the benefits of keeping space usage small to improve
time performance.

In any case, for the shootout, this is patently a different algorithm
to the one every other solution uses.  All the other languages do a
simple one-pass traversal of the file, in constant memory space.  Why
artificially handicap Haskell by insisting it do the job naively?

Regards,
Malcolm
___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] OCaml list sees abysmal Language Shootout results

2004-09-29 Thread Tomasz Zielonka
On Wed, Sep 29, 2004 at 01:41:03PM +0100, Graham Klyne wrote:
> >With code like this, I'm not surprised!
> >
> > main = do file <- getContents
> >           putStrLn $ show (length $ lines file) ++ " " ++
> >                      show (length $ words file) ++ " " ++
> >                      show (length file)
> >
> >Space-leak or what?
> 
> Er, please excuse a dumb question, but I'm struggling to see the problem 
> here.
> 
> I can see that this requires the original file to be kept for 3-time 
> scanning,  so enough memory for the entire file will be required.

It would be nice if these scans were performed concurrently in a way
that would make memory usage constant, wouldn't it? ;)

Hmmm... maybe some simple tracking of garbage collection results would
suffice... Reschedule if the current thread doesn't help in collecting
garbage... But I am dreaming now... :)

> Is that *the* problem to which you allude?  I can't see any other
> problem here.  And why would this put Haskell at a disadvantage?

The only problem is that some people may draw incorrect conclusions.
Should we care? I already submitted two improvements for shootout in
the last two days (not included yet), but I don't know if it's worth
the effort.

I remember SPJ's motto: ,,Avoid success at all cost''. Is this motto
still valid?


http://research.microsoft.com/Users/simonpj/papers/haskell-retrospective/HaskellRetrospective-2.pdf

Best regards,
Tom

-- 
.signature: Too many levels of symbolic links
___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] OCaml list sees abysmal Language Shootout results

2004-09-29 Thread Graham Klyne
At 10:55 28/09/04 +0100, Malcolm Wallace wrote:
> Keith Wansbrough <[EMAIL PROTECTED]> writes:
> >   I can't believe that a simple "wc" implementation should be
> > 570 times slower in Haskell than OCaml - could someone investigate and
> > fix the test?
>
> With code like this, I'm not surprised!
>
> main = do file <- getContents
>           putStrLn $ show (length $ lines file) ++ " " ++
>                      show (length $ words file) ++ " " ++
>                      show (length file)
>
> Space-leak or what?
Er, please excuse a dumb question, but I'm struggling to see the problem here.
I can see that this requires the original file to be kept for 3-time 
scanning,  so enough memory for the entire file will be required.  Is that 
*the* problem to which you allude?  I can't see any other problem 
here.  And why would this put Haskell at a disadvantage?

#g

Graham Klyne
For email:
http://www.ninebynine.org/#Contact
___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] OCaml list sees abysmal Language Shootout results

2004-09-28 Thread Henning Thielemann

On Tue, 28 Sep 2004, Tomasz Zielonka wrote:

> Changed readArray to unsafeRead, and it is 47 times faster now.
> 
> I must say I am pleasantly surprised that GHC managed to unbox
> everything there was to unbox without much annotations. For 5MB file the
> program allocated only 192KB in the heap. Especially optimisation of
> higher-level constructs like 'fmap (toEnum . fromEnum) ...' is very
> nice.

Now I'd like to see an implementation which is both elegant and fast ... 

:-)

___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] OCaml list sees abysmal Language Shootout results

2004-09-28 Thread Tomasz Zielonka
On Tue, Sep 28, 2004 at 12:49:52PM +0200, Tomasz Zielonka wrote:
> On Tue, Sep 28, 2004 at 12:01:11PM +0200, Tomasz Zielonka wrote:
> > On Tue, Sep 28, 2004 at 10:46:14AM +0100, Keith Wansbrough wrote:
> > > I just saw this on the OCaml list (in a posting by "Rafael 'Dido' 
> > > Sevilla" <[EMAIL PROTECTED]> in the "Observations on OCaml vs. Haskell" 
> > > thread).  I can't believe that a simple "wc" implementation should be 
> > > 570 times slower in Haskell than OCaml - could someone investigate and 
> > > fix the test?
> > 
> > No wonder it is so slow, this program looks like the result of some ,,as
> > slow as possible'' contest ;)
> 
> It took me half an hour to make a version which is 41 times faster
> on a 5MB file. It should be possible to make it even 2-3 times faster
> than this.

Changed readArray to unsafeRead, and it is 47 times faster now.

I must say I am pleasantly surprised that GHC managed to unbox
everything there was to unbox without much annotations. For 5MB file the
program allocated only 192KB in the heap. Especially optimisation of
higher-level constructs like 'fmap (toEnum . fromEnum) ...' is very
nice.

Code attached. Feel free to improve it.

Best regards,
Tom

-- 
.signature: Too many levels of symbolic links

import System.IO
import Data.Array.IO
import Data.Array.Base (unsafeRead)
import Data.Word
import Char
import List

wc :: Handle -> IO (Int, Int, Int)
wc h = do
    buf <- newArray_ (0, bufSize - 1) :: IO (IOUArray Int Word8)
    let
        wcLoop :: Char -> Int -> Int -> Int -> Int -> Int -> IO (Int, Int, Int)
        wcLoop prev nl nw nc i n
            | prev `seq` nl `seq` nw `seq` nc `seq` i `seq` n `seq` False =
                undefined
            | i == n =
                do  n' <- hGetArray h buf bufSize
                    if n' == 0
                        then return (nl, nw, nc)
                        else wcLoop prev nl nw nc 0 n'
            | otherwise =
                do  c <- fmap (toEnum . fromEnum) (unsafeRead buf i)
                    wcLoop
                        c
                        (nl + if c == '\n' then 1 else 0)
                        (nw + if not (isSpace c) && isSpace prev then 1 else 0)
                        (nc + 1)
                        (i + 1)
                        n
    wcLoop ' ' 0 0 0 0 0
  where
    bufSize :: Int
    bufSize = 8192

main = do
    (nl, nw, nc) <- wc stdin
    putStrLn $ concat $ intersperse " " $ map show [nl, nw, nc]

___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] OCaml list sees abysmal Language Shootout results

2004-09-28 Thread Tomasz Zielonka
On Tue, Sep 28, 2004 at 12:01:11PM +0200, Tomasz Zielonka wrote:
> On Tue, Sep 28, 2004 at 10:46:14AM +0100, Keith Wansbrough wrote:
> > I just saw this on the OCaml list (in a posting by "Rafael 'Dido' 
> > Sevilla" <[EMAIL PROTECTED]> in the "Observations on OCaml vs. Haskell" 
> > thread).  I can't believe that a simple "wc" implementation should be 
> > 570 times slower in Haskell than OCaml - could someone investigate and 
> > fix the test?
> 
> No wonder it is so slow, this program looks like the result of some ,,as
> slow as possible'' contest ;)

It took me half an hour to make a version which is 41 times faster
on a 5MB file. It should be possible to make it even 2-3 times faster
than this.

Best regards,
Tom

-- 
.signature: Too many levels of symbolic links
___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] OCaml list sees abysmal Language Shootout results

2004-09-28 Thread Tomasz Zielonka
On Tue, Sep 28, 2004 at 10:46:14AM +0100, Keith Wansbrough wrote:
> I just saw this on the OCaml list (in a posting by "Rafael 'Dido' 
> Sevilla" <[EMAIL PROTECTED]> in the "Observations on OCaml vs. Haskell" 
> thread).  I can't believe that a simple "wc" implementation should be 
> 570 times slower in Haskell than OCaml - could someone investigate and 
> fix the test?

No wonder it is so slow, this program looks like the result of some ,,as
slow as possible'' contest ;)

 main = do file <- getContents
           putStrLn $ show (length $ lines file) ++ " " ++
                      show (length $ words file) ++ " " ++
                      show (length file)

Best regards,
Tom

-- 
.signature: Too many levels of symbolic links
___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] OCaml list sees abysmal Language Shootout results

2004-09-28 Thread Malcolm Wallace
Keith Wansbrough <[EMAIL PROTECTED]> writes:

>   I can't believe that a simple "wc" implementation should be 
> 570 times slower in Haskell than OCaml - could someone investigate and 
> fix the test?

With code like this, I'm not surprised!

main = do file <- getContents
          putStrLn $ show (length $ lines file) ++ " " ++
                     show (length $ words file) ++ " " ++
                     show (length file)

Space-leak or what?
Regards,
Malcolm
___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] OCaml list sees abysmal Language Shootout results

2004-09-28 Thread Keith Wansbrough
I just saw this on the OCaml list (in a posting by "Rafael 'Dido' 
Sevilla" <[EMAIL PROTECTED]> in the "Observations on OCaml vs. Haskell" 
thread).  I can't believe that a simple "wc" implementation should be 
570 times slower in Haskell than OCaml - could someone investigate and 
fix the test?

--KW 8-)

http://caml.inria.fr/archives/200409/msg00485.html

> > 2. Haskell strings are lists of characters
> > 
> > It's annoying that strings aren't normally processed this way in OCaml, 
> > and even more annoying that (^) or (::) cannot be used in pattern 
> > matching over strings.  I like Haskell's approach.  The list 
> > concatenation operator is the same as the string concatenation operator 
> > in Haskell.
> > 
> 
> This is something of an efficiency/elegance tradeoff.  Making strings
> act like lists means potentially boxing *every character in the string*.
> In other words, it's potentially a very expensive way of doing business.
> Paul Graham was mulling over this kind of tradeoff in his design of Arc,
> as I recall.  Another language that does this type of thing is Erlang,
> and both languages seem to be significantly slower than OCaml in string
> handling, at least as far as this site goes:
> 
> http://shootout.alioth.debian.org/
> 
> For the word count benchmark OCaml scores 0.1850 seconds, while GHC is a
> dismal last place at 105.2110 seconds!  Even the bytecode ocaml is an
> order of magnitude faster.  The word frequency benchmark also shows this
> kind of poor string handling performance for Haskell, with OCaml scoring
> 0.5669 seconds, while GHC scores a truly dismal 6.4540, more than an
> order of magnitude slower, and even the bytecode ocaml is faster at
> 4.2644 seconds.
> 
> All in all, it would appear that Haskell's approach has been expensive
> in terms of performance, if the benchmarks are to be taken at face
> value.  Such are the tradeoffs language designers have to make.


___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe