subject:"\[Haskell\-cafe\] Slow IO"

Re: [Haskell-cafe] Slow IO?

2009-09-01 Thread Mark Wotton



On 02/09/2009, at 2:26 PM, Eugene Kirpichov wrote:





I've got a Centrino Duo 2000 (I'm on a notebook), Ubuntu 9.04 and  
ghc 6.10.2.


However, we have not set up on what exact input file we're using :)
I'm using one where it is written "1000 3" and then 1000 lines
of "9" follow.

Also, I wonder what one'd get if one compiled this program with jhc,
but I don't know whether jhc is able to compile Data.ByteString.


It couldn't last time I tried - choked on some INLINE pragmas. Might  
not be a massive job, but there aren't enough hours in the day...


mark
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] Slow IO?

2009-09-01 Thread Steve

On Wed, 2009-09-02 at 11:55 +0800, Steve wrote:
> On Tue, 2009-09-01 at 08:45 +0400, Eugene Kirpichov wrote:
> > Hm, on my machine Don's code has exactly the same performance my code above.
> That's strange.
> 
> > Also, replacing the 'test' and 'parse' functions with this one
> > 
> > add :: Int -> Int -> S.ByteString -> Int
> > add k i s = fst $ S.foldl' f (i, 0) s
> >   where f (!i, !n) '\n' | n`divisibleBy`k = (i+1, 0)
> > | otherwise   = (i,   0)
> > f (!i, !n) w  = (i, 10*n+ord w-ord '0')
> > 
> > increases performance by another 15% (0.675s vs 0.790s)
> 
> On my system I get a 50% slowdown using this add function!
> 
> I guess is just shows that benchmarking code on one single
> CPU/memory/OS/ghc combination does not give results that apply widely.
> I'm using:
> AMD Athlon X2 4800
> 2GB memory
> Linux (Fedora 11, 64-bit version)
> ghc 6.10.3

I should have also said that the test method and test data is important
too.

This is what I have been using:
$ time ./0450 < 0450.input.data
and looking at the 'real' value.

The file 0450.input.data is generated with a Python script:
#!/usr/bin/env python
'''
generate a data file for problem 0450
'''

from __future__ import division# new in 2.2, redundant in 3.0
from __future__ import absolute_import # new in 2.5, redundant in
2.7/3.0
from __future__ import print_function  # new in 2.6, redundant in 3.0

import io
import random

inFile = '0450.input.data'
#n, k, tiMax = 10**6, 3, 10**9
n, k, tiMax = 10**7, 3, 10**9

with io.open(inFile, 'wb') as f:
  f.write('%d %d\n' % (n, k))
  for i in xrange(n):
ti = random.randint(1, tiMax)
f.write('%d\n' % (ti,))


Steve

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] Slow IO?

2009-09-01 Thread Eugene Kirpichov

2009/9/2 Steve :
> On Tue, 2009-09-01 at 08:45 +0400, Eugene Kirpichov wrote:
>> Hm, on my machine Don's code has exactly the same performance my code above.
> That's strange.
>
>> Also, replacing the 'test' and 'parse' functions with this one
>>
>> add :: Int -> Int -> S.ByteString -> Int
>> add k i s = fst $ S.foldl' f (i, 0) s
>>   where f (!i, !n) '\n' | n`divisibleBy`k = (i+1, 0)
>>                         | otherwise       = (i,   0)
>>         f (!i, !n) w  = (i, 10*n+ord w-ord '0')
>>
>> increases performance by another 15% (0.675s vs 0.790s)
>
> On my system I get a 50% slowdown using this add function!
>
> I guess is just shows that benchmarking code on one single
> CPU/memory/OS/ghc combination does not give results that apply widely.
> I'm using:
> AMD Athlon X2 4800
> 2GB memory
> Linux (Fedora 11, 64-bit version)
> ghc 6.10.3
>

I've got a Centrino Duo 2000 (I'm on a notebook), Ubuntu 9.04 and ghc 6.10.2.

However, we have not set up on what exact input file we're using :)
I'm using one where it is written "1000 3" and then 1000 lines
of "9" follow.

Also, I wonder what one'd get if one compiled this program with jhc,
but I don't know whether jhc is able to compile Data.ByteString.

> Steve
>
>



-- 
Eugene Kirpichov
Web IR developer, market.yandex.ru
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] Slow IO?

2009-09-01 Thread Steve

On Tue, 2009-09-01 at 08:45 +0400, Eugene Kirpichov wrote:
> Hm, on my machine Don's code has exactly the same performance my code above.
That's strange.

> Also, replacing the 'test' and 'parse' functions with this one
> 
> add :: Int -> Int -> S.ByteString -> Int
> add k i s = fst $ S.foldl' f (i, 0) s
>   where f (!i, !n) '\n' | n`divisibleBy`k = (i+1, 0)
> | otherwise   = (i,   0)
> f (!i, !n) w  = (i, 10*n+ord w-ord '0')
> 
> increases performance by another 15% (0.675s vs 0.790s)

On my system I get a 50% slowdown using this add function!

I guess is just shows that benchmarking code on one single
CPU/memory/OS/ghc combination does not give results that apply widely.
I'm using:
AMD Athlon X2 4800
2GB memory
Linux (Fedora 11, 64-bit version)
ghc 6.10.3

Steve

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] Slow IO?

2009-09-01 Thread Jason Dusek

  I've uploaded a new version of bytestring-nums that, while
  still slower than the fast/custom codes, allows Eugene's
  earlier program to a little more than 20% faster than it did
  before. It no longer handles spurious characters in the input
  by skipping over them (this is probably not a common
  requirement, anyways).

http://hackage.haskell.org/package/bytestring-nums-0.3.0

  I suspect that splitting the string into pieces and then
  mapping the parser over the pieces will never be faster than
  an all-in-one parser/tester/incrementer like the fast programs
  have.

--
Jason Dusek
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] Slow IO?

2009-09-01 Thread Jason Dusek

2009/08/31 Don Stewart :
> If you can abstract out a common function for lexing ints out of
> bytestrings, we could add it to the bytestring-lexing package.

  All the really performant implementations operate on strings
  with multiple ints in them; I suspect this reduces memory
  traffic -- and indeed, Eugene's code using my libs allocates
  about twice as much memory as Don's code.

  I've tried a few different things with strictness annotations
  to no avail.

  I'm having some trouble understanding the meaning of "entries"
  in the profiler's output. I have a file with 5 million random
  integers in it, totalling 26210408 bytes (21210408 bytes of
  which are not newlines). The relevant part is here:

COST CENTRE  MODULE entries

MAIN MAIN 0
 mainMain 1
  bint   Main   501
   lazy_int  Data.ByteString.Nums.Careless.Int 41211385
digitize Data.ByteString.Nums.Careless.Int 21210408

  The number of "entries" to `lazy_int` is puzzling. Eugene's
  `bint` is called for each line of the file -- once for the
  header and then 5 million times for each of the integers.
  (There are two numbers on the first line but Eugene's program
  only uses `k` so `bint` is only actually entered once.)
  However, `bint` just calls my `int` and `int` calls `lazy_int`
  so why are there 41 million plus "entries" of `lazy_int`?

--
Jason Dusek


spoj-eugene-prof-opt-bang-acc-scc-dfold.prof
Description: Binary data
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] Slow IO?

2009-08-31 Thread Don Stewart

Good work guys.

If you can abstract out a common function for lexing ints out of
bytestrings, we could add it to the bytestring-lexing package.

ekirpichov:
> Hm, on my machine Don's code has exactly the same performance my code above.
> 
> Also, replacing the 'test' and 'parse' functions with this one
> 
> add :: Int -> Int -> S.ByteString -> Int
> add k i s = fst $ S.foldl' f (i, 0) s
>   where f (!i, !n) '\n' | n`divisibleBy`k = (i+1, 0)
> | otherwise   = (i,   0)
> f (!i, !n) w  = (i, 10*n+ord w-ord '0')
> 
> increases performance by another 15% (0.675s vs 0.790s)
> 
> 2009/9/1 Jason Dusek :
> >  I've updated Don Stewart's solution to compile with the modern
> >  ByteString libs. I'll be looking at ways to improve the
> >  performance of the `bytestring-nums` package.
> >
> > --
> > Jason Dusek
> >
> >
> > http://github.com/jsnx/bytestring-nums/blob/d7de9db83e44ade9958fb3bfad0b29ede065b5dd/SPOJDons.hs
> > ___
> > Haskell-Cafe mailing list
> > Haskell-Cafe@haskell.org
> > http://www.haskell.org/mailman/listinfo/haskell-cafe
> >
> 
> 
> 
> -- 
> Eugene Kirpichov
> Web IR developer, market.yandex.ru
> ___
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
> 
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] Slow IO?

2009-08-31 Thread Eugene Kirpichov

Hm, on my machine Don's code has exactly the same performance my code above.

Also, replacing the 'test' and 'parse' functions with this one

add :: Int -> Int -> S.ByteString -> Int
add k i s = fst $ S.foldl' f (i, 0) s
  where f (!i, !n) '\n' | n`divisibleBy`k = (i+1, 0)
| otherwise   = (i,   0)
f (!i, !n) w  = (i, 10*n+ord w-ord '0')

increases performance by another 15% (0.675s vs 0.790s)

2009/9/1 Jason Dusek :
>  I've updated Don Stewart's solution to compile with the modern
>  ByteString libs. I'll be looking at ways to improve the
>  performance of the `bytestring-nums` package.
>
> --
> Jason Dusek
>
>
> http://github.com/jsnx/bytestring-nums/blob/d7de9db83e44ade9958fb3bfad0b29ede065b5dd/SPOJDons.hs
> ___
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>



-- 
Eugene Kirpichov
Web IR developer, market.yandex.ru
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] Slow IO?

2009-08-31 Thread Jason Dusek

  I've updated Don Stewart's solution to compile with the modern
  ByteString libs. I'll be looking at ways to improve the
  performance of the `bytestring-nums` package.

--
Jason Dusek


http://github.com/jsnx/bytestring-nums/blob/d7de9db83e44ade9958fb3bfad0b29ede065b5dd/SPOJDons.hs
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] Slow IO?

2009-08-31 Thread Steve

On Sun, 2009-08-30 at 18:50 +0400, Eugene Kirpichov wrote:
> Thanks :) I wonder why SPOJ didn't accept the same thing from me.
> 
> I think that in order to obtain even higher performance we need to
> resort to low-level IO: raw reading into a byte buffer and parsing the
> very buffer to avoid memcpy'ing.
> Or, better, to use Oleg's iteratees with a file handle enumerator.
> I'll probably give it a try when I have time, but there's a 70% chance
> that I won't, so someone please try it, it should work :)

I just discovered that the SPOJ question regarding the problem
http://www.spoj.pl/problems/INTEST/
had already been asked about 2 years ago.
http://groups.google.com/group/fa.haskell/browse_thread/thread/4133fa71ce97eb0e/fef34d1c3943bbe0#fef34d1c3943bbe0

Donald Stewart gave a solution - long, complex and highly optimised
using knowledge of Data.ByteString internals. But fast - 2 or 3 times as
fast as your method.

Steve

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] Slow IO?

2009-08-31 Thread Jason Dusek

2009/08/30 Eugene Kirpichov :
> Here's my version that works in 0.7s for me for a file with 10^7
> "9"'s but for some reason gets a 'wrong answer' at SPOJ :)

  Maybe it gets a wrong answer because it reads all the input,
  regardless of `n`.

--
Jason Dusek
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] Slow IO?

2009-08-30 Thread Eugene Kirpichov

Bang patterns are easy: they make a part of the pattern strict by seq'ing it.

f (x, !y) = ...
~
f (x, y) = y `seq` ...

2009/8/30 Steve :
> On Sun, 2009-08-30 at 16:34 +0400, Eugene Kirpichov wrote:
>> Here's my version that works in 0.7s for me for a file with 10^7
>> "9"'s but for some reason gets a 'wrong answer' at SPOJ :)
>>
>> {-# LANGUAGE BangPatterns #-}
>> module Main where
>>
>> import qualified Data.ByteString.Lazy as B
>> import Data.Word
>>
>> answer :: Int -> B.ByteString -> Int
>> answer k = fst . B.foldl' f (0, 0)
>>   where f :: (Int,Int) -> Word8 -> (Int,Int)
>>         f (!countSoFar, !x) 10
>>           | x`mod`k==0 = (countSoFar+1, 0)
>>           | otherwise  = (countSoFar,   0)
>>         f (!countSoFar, !x) c = (countSoFar, 10*x+(fromIntegral c)-48)
>>
>> readInt :: B.ByteString -> Int
>> readInt = B.foldl' (\x c -> 10*x+fromIntegral c-48) 0
>>
>> main = do
>>   (line, rest) <- B.break (==10) `fmap` B.getContents
>>   let [n, k] = map readInt . B.split 32 $ line
>>   putStrLn . show $ answer k rest - 1
>
> Eugene,
> I ran your code on one of my test files and it gave the same answer as
> my code. So I submitted it and it was accepted. Its fast - twice as fast
> as my solution, using much less memory. Overall its the 4th fastest
> Haskell solution. (but its still 10 * slower than C/C++)
> I'll have to read up on BangPatterns to try to understand what its
> doing!
>
> I submitted it as:
>
> {-# LANGUAGE BangPatterns #-}
> module Main where
>
> import qualified Data.ByteString.Lazy as B
> import qualified Data.Word            as DW
>
> answer :: Int -> B.ByteString -> Int
> answer k = fst . B.foldl' f (0, 0)
>  where f :: (Int,Int) -> DW.Word8 -> (Int,Int)
>        f (!countSoFar, !x) 10
>          | x`mod`k==0 = (countSoFar+1, 0)
>          | otherwise  = (countSoFar,   0)
>        f (!countSoFar, !x) c = (countSoFar, 10*x+(fromIntegral c)-48)
>
> readInt :: B.ByteString -> Int
> readInt = B.foldl' (\x c -> 10*x+fromIntegral c-48) 0
>
> main :: IO ()
> main = do
>  (line, rest) <- B.break (==10) `fmap` B.getContents
>  let [_, k] = map readInt . B.split 32 $ line
>  putStrLn . show $ answer k rest - 1
>
>
>



-- 
Eugene Kirpichov
Web IR developer, market.yandex.ru
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] Slow IO?

2009-08-30 Thread Eugene Kirpichov

Thanks :) I wonder why SPOJ didn't accept the same thing from me.

I think that in order to obtain even higher performance we need to
resort to low-level IO: raw reading into a byte buffer and parsing the
very buffer to avoid memcpy'ing.
Or, better, to use Oleg's iteratees with a file handle enumerator.
I'll probably give it a try when I have time, but there's a 70% chance
that I won't, so someone please try it, it should work :)

2009/8/30 Steve :
> On Sun, 2009-08-30 at 16:34 +0400, Eugene Kirpichov wrote:
>> Here's my version that works in 0.7s for me for a file with 10^7
>> "9"'s but for some reason gets a 'wrong answer' at SPOJ :)
>>
>> {-# LANGUAGE BangPatterns #-}
>> module Main where
>>
>> import qualified Data.ByteString.Lazy as B
>> import Data.Word
>>
>> answer :: Int -> B.ByteString -> Int
>> answer k = fst . B.foldl' f (0, 0)
>>   where f :: (Int,Int) -> Word8 -> (Int,Int)
>>         f (!countSoFar, !x) 10
>>           | x`mod`k==0 = (countSoFar+1, 0)
>>           | otherwise  = (countSoFar,   0)
>>         f (!countSoFar, !x) c = (countSoFar, 10*x+(fromIntegral c)-48)
>>
>> readInt :: B.ByteString -> Int
>> readInt = B.foldl' (\x c -> 10*x+fromIntegral c-48) 0
>>
>> main = do
>>   (line, rest) <- B.break (==10) `fmap` B.getContents
>>   let [n, k] = map readInt . B.split 32 $ line
>>   putStrLn . show $ answer k rest - 1
>
> Eugene,
> I ran your code on one of my test files and it gave the same answer as
> my code. So I submitted it and it was accepted. Its fast - twice as fast
> as my solution, using much less memory. Overall its the 4th fastest
> Haskell solution. (but its still 10 * slower than C/C++)
> I'll have to read up on BangPatterns to try to understand what its
> doing!
>
> I submitted it as:
>
> {-# LANGUAGE BangPatterns #-}
> module Main where
>
> import qualified Data.ByteString.Lazy as B
> import qualified Data.Word            as DW
>
> answer :: Int -> B.ByteString -> Int
> answer k = fst . B.foldl' f (0, 0)
>  where f :: (Int,Int) -> DW.Word8 -> (Int,Int)
>        f (!countSoFar, !x) 10
>          | x`mod`k==0 = (countSoFar+1, 0)
>          | otherwise  = (countSoFar,   0)
>        f (!countSoFar, !x) c = (countSoFar, 10*x+(fromIntegral c)-48)
>
> readInt :: B.ByteString -> Int
> readInt = B.foldl' (\x c -> 10*x+fromIntegral c-48) 0
>
> main :: IO ()
> main = do
>  (line, rest) <- B.break (==10) `fmap` B.getContents
>  let [_, k] = map readInt . B.split 32 $ line
>  putStrLn . show $ answer k rest - 1
>
>
>



-- 
Eugene Kirpichov
Web IR developer, market.yandex.ru
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] Slow IO?

2009-08-30 Thread Steve

On Sun, 2009-08-30 at 16:34 +0400, Eugene Kirpichov wrote:
> Here's my version that works in 0.7s for me for a file with 10^7
> "9"'s but for some reason gets a 'wrong answer' at SPOJ :)
> 
> {-# LANGUAGE BangPatterns #-}
> module Main where
> 
> import qualified Data.ByteString.Lazy as B
> import Data.Word
> 
> answer :: Int -> B.ByteString -> Int
> answer k = fst . B.foldl' f (0, 0)
>   where f :: (Int,Int) -> Word8 -> (Int,Int)
> f (!countSoFar, !x) 10
>   | x`mod`k==0 = (countSoFar+1, 0)
>   | otherwise  = (countSoFar,   0)
> f (!countSoFar, !x) c = (countSoFar, 10*x+(fromIntegral c)-48)
> 
> readInt :: B.ByteString -> Int
> readInt = B.foldl' (\x c -> 10*x+fromIntegral c-48) 0
> 
> main = do
>   (line, rest) <- B.break (==10) `fmap` B.getContents
>   let [n, k] = map readInt . B.split 32 $ line
>   putStrLn . show $ answer k rest - 1

Eugene,
I ran your code on one of my test files and it gave the same answer as
my code. So I submitted it and it was accepted. Its fast - twice as fast
as my solution, using much less memory. Overall its the 4th fastest
Haskell solution. (but its still 10 * slower than C/C++)
I'll have to read up on BangPatterns to try to understand what its
doing!

I submitted it as:

{-# LANGUAGE BangPatterns #-}
module Main where

import qualified Data.ByteString.Lazy as B
import qualified Data.Wordas DW

answer :: Int -> B.ByteString -> Int
answer k = fst . B.foldl' f (0, 0)
  where f :: (Int,Int) -> DW.Word8 -> (Int,Int)
f (!countSoFar, !x) 10
  | x`mod`k==0 = (countSoFar+1, 0)
  | otherwise  = (countSoFar,   0)
f (!countSoFar, !x) c = (countSoFar, 10*x+(fromIntegral c)-48)

readInt :: B.ByteString -> Int
readInt = B.foldl' (\x c -> 10*x+fromIntegral c-48) 0

main :: IO ()
main = do
  (line, rest) <- B.break (==10) `fmap` B.getContents
  let [_, k] = map readInt . B.split 32 $ line
  putStrLn . show $ answer k rest - 1


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: Re[4]: [Haskell-cafe] Slow IO?

2009-08-30 Thread Peter Verswyvelen

On Sun, Aug 30, 2009 at 2:51 PM, Bulat Ziganshin
wrote:

> these all are different things, and talking about ByteString IO speed
> is the same as talking of speed of red cars
>

Okay, but statistically, most red cars are very fast Ferrari's no? :-)
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re[4]: [Haskell-cafe] Slow IO?

2009-08-30 Thread Bulat Ziganshin

Hello Peter,

Sunday, August 30, 2009, 4:36:55 PM, you wrote:

>> I compared the top 10 C/C++ results against the top 10 Haskell results:
> So to me it seems he's not talking about his code.

well, he talks about 20 programs

> Anyway, I thought Haskell's ByteString IO should not be that much slower 
> anyway.

what you mean by ByteString IO speed? speed of reading 100 gb file?
speed of reading 1 byte? speed of readInt? speed of those 10 programs?
these all are different things, and talking about ByteString IO speed
is the same as talking of speed of red cars

-- 
Best regards,
 Bulatmailto:bulat.zigans...@gmail.com

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: Re[2]: [Haskell-cafe] Slow IO?

2009-08-30 Thread Steve

On Sun, 2009-08-30 at 16:15 +0400, Bulat Ziganshin wrote: 
> Hello Steve,
> 
> Sunday, August 30, 2009, 3:54:53 PM, you wrote:
> 
> > So it looks like Haskell is ~13 slower for IO than C/C++, even (I
> > assume) when using Data.ByteString or other speed-up tricks.
> 
> it means that *your* program is 13x slower than C one and nothing
> more. in particular, your program may be constrained by readInt
> speed
> 

No, not at all. I did not count my program when comparing C/C++
to Haskell. I was counting the the *top 10* programs (submitted by
everybody) in the "Best Solutions" list. So its a general survey of
the best haskell solutions against the best C/C++ solutions.

Steve

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] Slow IO?

2009-08-30 Thread Steve

On Sun, 2009-08-30 at 14:40 +0400, Eugene Kirpichov wrote:
Thanks, that works nicely too. However, I believe its not a standard
package, so I don't think it can be used for Sphere Online problems.

I timed a test run on a 10MB file and its a little slower than my
solution with the ByteString readInt improvement.

Steve

> module Main where
> 
> import qualified Data.ByteString.Lazy as B
> import Data.ByteString.Nums.Careless -- from bytestring-nums package
> 
> bint :: B.ByteString -> Int
> bint = int
> 
> main = do
>   line : rest <- B.split 10 `fmap` B.getContents
>   let [n, k] = map int . B.split 32 $ line
>   putStrLn . show . length . tail . filter ((==0).(`mod`k).bint)  $ rest
> 
> This does a 100MB file in 2.7s (probably because the file is cached by
> the filesystem).
> 
> 2009/8/30 Steve :
> > Hi,
> > I'm tackling a Sphere Online Judge tutorial question where it tests how
> > fast you can process input data. You need to achieve at least 2.5MB of
> > input data per second at runtime (on an old machine running ghc 6.6.1).
> > This is probably close to the limit of Haskell's ability.
> >
> > https://www.spoj.pl/problems/INTEST/
> >
> > I can see that 24 haskell programmers have solved it, but most are very
> > close to the 8 secs limit (and 6/24 are even over the limit!).
> >
> > Here's my code. It fails with a "time limit exceeded" error. (I think it
> > would calculate the correct result, eventually).
> >
> > module Main where
> >
> > import qualified Data.List as DLi
> > import qualified System.IO as SIO
> >
> > main :: IO ()
> > main = do
> >  line1 <- SIO.hGetLine SIO.stdin
> >  let k = read $ words line1 !! 1
> >  s <- SIO.hGetContents SIO.stdin
> >  print $ count s k
> >
> > count :: String -> Int -> Int
> > count s k = DLi.foldl' foldFunc 0 (map read $ words s)
> >  where
> >foldFunc :: Int -> Int -> Int
> >foldFunc a b
> >  | mod b k == 0  = a+1
> >  | otherwise = a
> >
> >
> > I tried using Data.ByteString but then found that 'read' needs a String,
> > not a ByteString.
> > I tried using buffered IO, but it did not make any difference.
> >
> > Any suggestions on how to speed it up?
> >
> > Regards,
> > Steve
> >
> > ___
> > Haskell-Cafe mailing list
> > Haskell-Cafe@haskell.org
> > http://www.haskell.org/mailman/listinfo/haskell-cafe
> >
> 
> 
> 

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: Re[2]: [Haskell-cafe] Slow IO?

2009-08-30 Thread Peter Verswyvelen

Well, Steve wrote:
> I compared the top 10 C/C++ results against the top 10 Haskell results:

So to me it seems he's not talking about his code.

Anyway, I thought Haskell's ByteString IO should not be that much slower
anyway.

Not sure how lazy ByteString IO is implemented, but if it performs async
(aka overlapped) IO, it could be very very fast (faster than C), since the
reading of the next buffer from (or writing of the previous buffer to) the
file is then completely parallel with the computation (when done inplace you
even don't need a memcpy, although these days the overhead of copying 64KB
of memory might be very tiny, it used to be different in the old days :-) At
least that's how I did it in the past in C++ with templates, which was
faster than the C approach.

On Sun, Aug 30, 2009 at 2:15 PM, Bulat Ziganshin
wrote:

> Hello Steve,
>
> Sunday, August 30, 2009, 3:54:53 PM, you wrote:
>
> > So it looks like Haskell is ~13 slower for IO than C/C++, even (I
> > assume) when using Data.ByteString or other speed-up tricks.
>
> it means that *your* program is 13x slower than C one and nothing
> more. in particular, your program may be constrained by readInt
> speed
>
>
> --
> Best regards,
>  Bulatmailto:bulat.zigans...@gmail.com
>
> ___
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] Slow IO?

2009-08-30 Thread Eugene Kirpichov

Here's my version that works in 0.7s for me for a file with 10^7
"9"'s but for some reason gets a 'wrong answer' at SPOJ :)

{-# LANGUAGE BangPatterns #-}
module Main where

import qualified Data.ByteString.Lazy as B
import Data.Word

answer :: Int -> B.ByteString -> Int
answer k = fst . B.foldl' f (0, 0)
  where f :: (Int,Int) -> Word8 -> (Int,Int)
f (!countSoFar, !x) 10
  | x`mod`k==0 = (countSoFar+1, 0)
  | otherwise  = (countSoFar,   0)
f (!countSoFar, !x) c = (countSoFar, 10*x+(fromIntegral c)-48)

readInt :: B.ByteString -> Int
readInt = B.foldl' (\x c -> 10*x+fromIntegral c-48) 0

main = do
  (line, rest) <- B.break (==10) `fmap` B.getContents
  let [n, k] = map readInt . B.split 32 $ line
  putStrLn . show $ answer k rest - 1


2009/8/30 Steve :
> On Sun, 2009-08-30 at 06:30 -0400, Gwern Branwen wrote:
>> On Sun, Aug 30, 2009 at 6:14 AM, Steve wrote:
>> > Hi,
>> > I'm tackling a Sphere Online Judge tutorial question where it tests how
>> > fast you can process input data. You need to achieve at least 2.5MB of
>> > input data per second at runtime (on an old machine running ghc 6.6.1).
>> > This is probably close to the limit of Haskell's ability.
>> >
>> > https://www.spoj.pl/problems/INTEST/
>> >
>> > I can see that 24 haskell programmers have solved it, but most are very
>> > close to the 8 secs limit (and 6/24 are even over the limit!).
>> >
>> > Here's my code. It fails with a "time limit exceeded" error. (I think it
>> > would calculate the correct result, eventually).
>> >
>> > module Main where
>> >
>> > import qualified Data.List as DLi
>> > import qualified System.IO as SIO
>> >
>> > main :: IO ()
>> > main = do
>> >  line1 <- SIO.hGetLine SIO.stdin
>> >  let k = read $ words line1 !! 1
>> >  s <- SIO.hGetContents SIO.stdin
>> >  print $ count s k
>> >
>> > count :: String -> Int -> Int
>> > count s k = DLi.foldl' foldFunc 0 (map read $ words s)
>> >  where
>> >    foldFunc :: Int -> Int -> Int
>> >    foldFunc a b
>> >      | mod b k == 0  = a+1
>> >      | otherwise     = a
>> >
>> >
>> > I tried using Data.ByteString but then found that 'read' needs a String,
>> > not a ByteString.
>> > I tried using buffered IO, but it did not make any difference.
>> >
>> > Any suggestions on how to speed it up?
>> >
>> > Regards,
>> > Steve
>>
>> Did you try readInt?
>> http://hackage.haskell.org/packages/archive/bytestring/0.9.1.4/doc/html/Data-ByteString-Char8.html#23
>>
>
> Thanks. I didn't see readInt. It allows me to use ByteString and produce
> results about 10 times faster than System.IO hGetContents. It makes me
> wonder why the System.IO functions have not been replaced by
> Data.ByteString.
>
> My program runs in 8.56 seconds (its over the 8 secs limit but it was
> accepted).
>
> I compared the top 10 C/C++ results against the top 10 Haskell results:
> C/C++   ~0.4 secs
> Haskell ~5.0 secs
> So it looks like Haskell is ~13 slower for IO than C/C++, even (I
> assume) when using Data.ByteString or other speed-up tricks.
>
> Steve
>
>
>
>
>
> ___
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>



-- 
Eugene Kirpichov
Web IR developer, market.yandex.ru
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re[2]: [Haskell-cafe] Slow IO?

2009-08-30 Thread Bulat Ziganshin

Hello Steve,

Sunday, August 30, 2009, 3:54:53 PM, you wrote:

> So it looks like Haskell is ~13 slower for IO than C/C++, even (I
> assume) when using Data.ByteString or other speed-up tricks.

it means that *your* program is 13x slower than C one and nothing
more. in particular, your program may be constrained by readInt
speed


-- 
Best regards,
 Bulatmailto:bulat.zigans...@gmail.com

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] Slow IO?

2009-08-30 Thread Steve

On Sun, 2009-08-30 at 06:30 -0400, Gwern Branwen wrote:
> On Sun, Aug 30, 2009 at 6:14 AM, Steve wrote:
> > Hi,
> > I'm tackling a Sphere Online Judge tutorial question where it tests how
> > fast you can process input data. You need to achieve at least 2.5MB of
> > input data per second at runtime (on an old machine running ghc 6.6.1).
> > This is probably close to the limit of Haskell's ability.
> >
> > https://www.spoj.pl/problems/INTEST/
> >
> > I can see that 24 haskell programmers have solved it, but most are very
> > close to the 8 secs limit (and 6/24 are even over the limit!).
> >
> > Here's my code. It fails with a "time limit exceeded" error. (I think it
> > would calculate the correct result, eventually).
> >
> > module Main where
> >
> > import qualified Data.List as DLi
> > import qualified System.IO as SIO
> >
> > main :: IO ()
> > main = do
> >  line1 <- SIO.hGetLine SIO.stdin
> >  let k = read $ words line1 !! 1
> >  s <- SIO.hGetContents SIO.stdin
> >  print $ count s k
> >
> > count :: String -> Int -> Int
> > count s k = DLi.foldl' foldFunc 0 (map read $ words s)
> >  where
> >foldFunc :: Int -> Int -> Int
> >foldFunc a b
> >  | mod b k == 0  = a+1
> >  | otherwise = a
> >
> >
> > I tried using Data.ByteString but then found that 'read' needs a String,
> > not a ByteString.
> > I tried using buffered IO, but it did not make any difference.
> >
> > Any suggestions on how to speed it up?
> >
> > Regards,
> > Steve
> 
> Did you try readInt?
> http://hackage.haskell.org/packages/archive/bytestring/0.9.1.4/doc/html/Data-ByteString-Char8.html#23
> 

Thanks. I didn't see readInt. It allows me to use ByteString and produce
results about 10 times faster than System.IO hGetContents. It makes me
wonder why the System.IO functions have not been replaced by
Data.ByteString.

My program runs in 8.56 seconds (its over the 8 secs limit but it was
accepted).

I compared the top 10 C/C++ results against the top 10 Haskell results:
C/C++   ~0.4 secs
Haskell ~5.0 secs
So it looks like Haskell is ~13 slower for IO than C/C++, even (I
assume) when using Data.ByteString or other speed-up tricks.

Steve





___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] Slow IO?

2009-08-30 Thread Eugene Kirpichov

module Main where

import qualified Data.ByteString.Lazy as B
import Data.ByteString.Nums.Careless -- from bytestring-nums package

bint :: B.ByteString -> Int
bint = int

main = do
  line : rest <- B.split 10 `fmap` B.getContents
  let [n, k] = map int . B.split 32 $ line
  putStrLn . show . length . tail . filter ((==0).(`mod`k).bint)  $ rest

This does a 100MB file in 2.7s (probably because the file is cached by
the filesystem).

2009/8/30 Steve :
> Hi,
> I'm tackling a Sphere Online Judge tutorial question where it tests how
> fast you can process input data. You need to achieve at least 2.5MB of
> input data per second at runtime (on an old machine running ghc 6.6.1).
> This is probably close to the limit of Haskell's ability.
>
> https://www.spoj.pl/problems/INTEST/
>
> I can see that 24 haskell programmers have solved it, but most are very
> close to the 8 secs limit (and 6/24 are even over the limit!).
>
> Here's my code. It fails with a "time limit exceeded" error. (I think it
> would calculate the correct result, eventually).
>
> module Main where
>
> import qualified Data.List as DLi
> import qualified System.IO as SIO
>
> main :: IO ()
> main = do
>  line1 <- SIO.hGetLine SIO.stdin
>  let k = read $ words line1 !! 1
>  s <- SIO.hGetContents SIO.stdin
>  print $ count s k
>
> count :: String -> Int -> Int
> count s k = DLi.foldl' foldFunc 0 (map read $ words s)
>  where
>    foldFunc :: Int -> Int -> Int
>    foldFunc a b
>      | mod b k == 0  = a+1
>      | otherwise     = a
>
>
> I tried using Data.ByteString but then found that 'read' needs a String,
> not a ByteString.
> I tried using buffered IO, but it did not make any difference.
>
> Any suggestions on how to speed it up?
>
> Regards,
> Steve
>
> ___
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>



-- 
Eugene Kirpichov
Web IR developer, market.yandex.ru
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] Slow IO?

2009-08-30 Thread Gwern Branwen

On Sun, Aug 30, 2009 at 6:14 AM, Steve wrote:
> Hi,
> I'm tackling a Sphere Online Judge tutorial question where it tests how
> fast you can process input data. You need to achieve at least 2.5MB of
> input data per second at runtime (on an old machine running ghc 6.6.1).
> This is probably close to the limit of Haskell's ability.
>
> https://www.spoj.pl/problems/INTEST/
>
> I can see that 24 haskell programmers have solved it, but most are very
> close to the 8 secs limit (and 6/24 are even over the limit!).
>
> Here's my code. It fails with a "time limit exceeded" error. (I think it
> would calculate the correct result, eventually).
>
> module Main where
>
> import qualified Data.List as DLi
> import qualified System.IO as SIO
>
> main :: IO ()
> main = do
>  line1 <- SIO.hGetLine SIO.stdin
>  let k = read $ words line1 !! 1
>  s <- SIO.hGetContents SIO.stdin
>  print $ count s k
>
> count :: String -> Int -> Int
> count s k = DLi.foldl' foldFunc 0 (map read $ words s)
>  where
>    foldFunc :: Int -> Int -> Int
>    foldFunc a b
>      | mod b k == 0  = a+1
>      | otherwise     = a
>
>
> I tried using Data.ByteString but then found that 'read' needs a String,
> not a ByteString.
> I tried using buffered IO, but it did not make any difference.
>
> Any suggestions on how to speed it up?
>
> Regards,
> Steve

Did you try readInt?
http://hackage.haskell.org/packages/archive/bytestring/0.9.1.4/doc/html/Data-ByteString-Char8.html#23

-- 
gwern
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

[Haskell-cafe] Slow IO?

2009-08-30 Thread Steve

Hi,
I'm tackling a Sphere Online Judge tutorial question where it tests how
fast you can process input data. You need to achieve at least 2.5MB of
input data per second at runtime (on an old machine running ghc 6.6.1).
This is probably close to the limit of Haskell's ability.

https://www.spoj.pl/problems/INTEST/

I can see that 24 haskell programmers have solved it, but most are very
close to the 8 secs limit (and 6/24 are even over the limit!).

Here's my code. It fails with a "time limit exceeded" error. (I think it
would calculate the correct result, eventually). 

module Main where

import qualified Data.List as DLi
import qualified System.IO as SIO

main :: IO ()
main = do
  line1 <- SIO.hGetLine SIO.stdin
  let k = read $ words line1 !! 1
  s <- SIO.hGetContents SIO.stdin
  print $ count s k

count :: String -> Int -> Int
count s k = DLi.foldl' foldFunc 0 (map read $ words s)
  where
foldFunc :: Int -> Int -> Int
foldFunc a b
  | mod b k == 0  = a+1
  | otherwise = a


I tried using Data.ByteString but then found that 'read' needs a String,
not a ByteString.
I tried using buffered IO, but it did not make any difference.

Any suggestions on how to speed it up?

Regards,
Steve

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] Slow IO or bad code?

2007-08-09 Thread Vimal

On 8/9/07, Marc Weber <[EMAIL PROTECTED]> wrote:
> > I wrote a code, but seems to give "Time limit exceeded"!
> ??
> Your code writes
> 15 to stdout which is correct (with the example given on the page)..
> You have to explain what you mean by >>seems to give "Time limit exceeded"<<
>
> > loop t function
> Does already exist.
> sequence $ replicate 10 function
> is a much shorter way :-)
>
> oor perhaps mapM_ [ function | i <- [1..10] ] )
>
> prod, to_int:
> You can both implement using higher order functions
>
> prod = sum . zipWith (*)
> to_int = map read
>

Thanks :) Yes, I see no reason why the code should be rejected by the
judge (Time limit exceeded)  just because I had defined all the
functions. I had done this on many other occasions, and they all had
worked well.

I think that it has a lot to do with the (read) function and how it is
implemented. So parsing takes quite a bit of time, and eventually most
of the time gets used up in processing input.

But the new functions are wonderful :-) I had a hunch that these
functions should have been defined, but I learnt a lot in the process
of writing those functions again!

-- Vimal
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] Slow IO or bad code?

2007-08-09 Thread Vimal

On 8/9/07, Brent Yorgey <[EMAIL PROTECTED]> wrote:
> On 8/9/07, Chaddaï Fouché <[EMAIL PROTECTED]> wrote:
> > I get "Wrong answer" with the following code for the same problem...
> > Is there something strange in this code :
>
> This problem description is not worded very well.  You have to figure out
> the matching that maximizes the sum of hotnesses; you don't necessarily just
> do a sum . zipWith (*).
>
Exactly.
The description says:
"Company XYZ has done the work for you, and now do xxx". This confused
me a lot :-)
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] Slow IO or bad code?

2007-08-09 Thread Brent Yorgey

On 8/9/07, Chaddaï Fouché <[EMAIL PROTECTED]> wrote:
>
> I get "Wrong answer" with the following code for the same problem...
> Is there something strange in this code :

This problem description is not worded very well.  You have to figure out
the matching that maximizes the sum of hotnesses; you don't necessarily just
do a sum . zipWith (*).

-Brent
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] Slow IO or bad code?

2007-08-09 Thread Chaddaï Fouché

Note that this code isn't more successful, clearly I have
misunderstood one requirement :

import qualified Data.ByteString.Char8 as B
import Data.List (unfoldr)

main = B.interact $ hot

hot = B.unlines . map (B.pack . show) . processList . tail . unfoldr readInt1

readInt1 cs = do
  (n, cs') <- B.readInt cs
  return (n, B.tail cs')

processList [] = []
processList (x:xs) = (sum $ zipWith (*) men women) : processList rest
where (men,r1) = splitAt x xs
  (women,rest) = splitAt x r1

-- 
Jedaï
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] Slow IO or bad code?

2007-08-09 Thread Chaddaï Fouché

I get "Wrong answer" with the following code for the same problem...
Is there something strange in this code :


module Main where
import qualified Data.ByteString.Char8 as B

main =
B.getLine >>=
sequence_ . flip replicate hot . maybe 0 fst . B.readInt

hot = do
B.getLine
men <- B.getLine
women <- B.getLine
print
  $ sum
  $ zipWith (*)
(map (maybe 0 fst . B.readInt) $ B.words men)
(map (maybe 0 fst . B.readInt) $ B.words women)


???
I get the expected results with my tests.

-- 
Jedaï
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] Slow IO or bad code?

2007-08-09 Thread Brent Yorgey

On 8/9/07, Marc Weber <[EMAIL PROTECTED]> wrote:
>
> > I wrote a code, but seems to give "Time limit exceeded"!
> ??
> Your code writes
> 15 to stdout which is correct (with the example given on the page)..
> You have to explain what you mean by >>seems to give "Time limit
> exceeded"<<
>

I think Vimal is referring to a message from SPOJ rather than the compiler.
I.e. the program runs too slowly so it is rejected by the judging software.

-Brent
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] Slow IO or bad code?

2007-08-09 Thread Marc Weber

> I wrote a code, but seems to give "Time limit exceeded"!
??
Your code writes
15 to stdout which is correct (with the example given on the page)..
You have to explain what you mean by >>seems to give "Time limit exceeded"<<

> loop t function
Does already exist.
sequence $ replicate 10 function
is a much shorter way :-)

oor perhaps mapM_ [ function | i <- [1..10] ] )

prod, to_int:
You can both implement using higher order functions

prod = sum . zipWith (*)
to_int = map read

Marc
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] Slow IO or bad code?

2007-08-09 Thread Vimal

@Donald:
Thanks for the link.

> prod = sum . zipWith (*)
>
> This is the slow part. Prelude.read ist really slow.
>
> Futhermore use the recusion pattern again:
> to_int = map read
>
> What is n used for?
@Lutz:
Those are some nice tricks... Thanks!
Now, the 'n' is for getting the number of numbers in the list. Which I
don't need, since I had a way around it. I just had to skip that
line...
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] Slow IO or bad code?

2007-08-09 Thread Lutz Donnerhacke

* Vimal wrote:
>>> Beginning of CODE
> loop t function
>  | t == 1 = do function
>  | otherwise = do { function; loop (t - 1) function }
>
> prod [] [] = 0
> prod (a:as) (b:bs) = a*b + prod as bs

prod = sum . zipWith (*)

> to_int :: [String] -> [Integer]
> to_int [] = []
> to_int (x:xs) = (read x) : to_int xs

This is the slow part. Prelude.read ist really slow.

Futhermore use the recusion pattern again:
to_int = map read

> doit = do
>   n <- getLine
>   a <- getLine
>   b <- getLine
>   let la = to_int (words a);
>   lb = to_int (words b); in
> print (prod la lb)

What is n used for?
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] Slow IO or bad code?

2007-08-09 Thread Donald Bruce Stewart

j.vimal:
> Hi
> I am practicing writing code in haskell, by solving problems at this
> site. http://spoj.pl.
> The problem http://spoj.pl/problems/FASHION , is pretty simple.
> 
> 1. Given two lists A,B , of N numbers, sort them and take sum of products.
> i.e. Sum ai * bi
> 
> I wrote a code, but seems to give "Time limit exceeded"!

We have a page for these SPOJ problems:


http://haskell.org/haskellwiki/SPOJ#Techniques_for_dealing_with_problems_efficiently

With bytestring IO, you can aim to be around the speed of OCaml or C++,
according to the existing bytestring entries.

-- Don
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

[Haskell-cafe] Slow IO or bad code?

2007-08-09 Thread Vimal

Hi
I am practicing writing code in haskell, by solving problems at this
site. http://spoj.pl.
The problem http://spoj.pl/problems/FASHION , is pretty simple.

1. Given two lists A,B , of N numbers, sort them and take sum of products.
i.e. Sum ai * bi

I wrote a code, but seems to give "Time limit exceeded"!

>> Beginning of CODE
loop t function
  | t == 1 = do function
  | otherwise = do { function; loop (t - 1) function }

prod [] [] = 0
prod (a:as) (b:bs) = a*b + prod as bs

to_int :: [String] -> [Integer]
to_int [] = []
to_int (x:xs) = (read x) : to_int xs

doit = do
  n <- getLine
  a <- getLine
  b <- getLine
  let la = to_int (words a);
  lb = to_int (words b); in
print (prod la lb)

main = do
  t <- getLine
  loop (read t) doit
<< END OF CODE

I would love to see if there is any improvement that can be done, to
the code ...

Thanks!

Vimal
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] Slow IO

2006-09-14 Thread Ketil Malde

Daniel Fischer <[EMAIL PROTECTED]> writes:

> Yes, I did it wrong, but I didn't keep anything (but the first and last Char 
> of each line) in memory on purpose. I hoped for the lines to be read one 
> after the other, head and last extracted 
   [...]
> Profiling (hy,hc) shows that the IO part of the programme holds on to tons of 
> lists - that couldn't be anything but parts of the file-contents, I believe.

Chances are that you don't evaluate these characters right away, so
what you are retaining is just unevaluated thunks that refer to the
lines themselves.  So the fix is to make the extraction of the
characters stricter.

-k
-- 
If I haven't seen further, it is by standing in the footprints of giants

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] Slow IO

2006-09-13 Thread Daniel Fischer

Am Mittwoch, 13. September 2006 11:07 schrieben Sie:
> Daniel Fischer wrote:
> > > Most certainly not.  I'm pretty sure this is to a bug in your code.
> > > Something retains a data structure which is actually unneeded. 
> > > Probably
> >
> > Apparently. And my money is on a load of lines from the file (of which I
> > need only the first and last Char).
>
> Then you're doing it wrong[TM].  You shouldn't need to keep any part of

Yes, I did it wrong, but I didn't keep anything (but the first and last Char 
of each line) in memory on purpose. I hoped for the lines to be read one 
after the other, head and last extracted - possibly immediately passed to 
accumArray, but I wouldn't necessarily expect that - and the already used 
lines thrown in the dustbin on next GC. Maybe the compiler couldn't figure 
out that it wouldn't access these lines anymore.

> the input in memory.  Whatever it is, nobody can tell you without seeing
> the code.  Try heap profiling, should you have no idea where to look for
> leaks.

Profiling (hy,hc) shows that the IO part of the programme holds on to tons of 
lists - that couldn't be anything but parts of the file-contents, I believe.
>
> > How could I solve the problem without representing the graph in some way?
>
> By using an advanced tool called "brains".  Sorry for not being more
> specific, but that's actually the fun part of the challenge and I'm not
> going to spoil it for you.  ;-)
>
> > Forgive the stupid question, but where if not RAM would the chunk
> > currently processed reside?
>
> Oh, I overlooked "chunk".  Well, yes, the "chunk" currently processed
> needs to fit into RAM.  But how much of a problem could a single Char
> pose?

Well, if it's the last straw...
But not much, I presume and even though it might be that we must have a few 
thousand Chars inmemory, that shouldn't do much harm either.

>
> Donald Bruce Stewart wrote:
> > I agree. Some problems simply require you to hold large strings in
> > memory. And for those, [Char] conks out around 5-10M (try reversing a
> > 10M [Char]).
>
> Sure, this one just isn't of that kind.

Yes, but I didn't tell the compiler :-(

>
>
> Udo.

Cheers,
Daniel

-- 

"In My Egotistical Opinion, most people's C programs should be
indented six feet downward and covered with dirt."
-- Blair P. Houghton

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re[2]: [Haskell-cafe] Slow IO

2006-09-13 Thread Bulat Ziganshin

Hello Ketil,

Wednesday, September 13, 2006, 10:41:13 AM, you wrote:

> But a String is something like 8 or 12 bytes per character, a
> ByteString gets you down to 1.

12-16. Char itself, pointer to the next list element, and two boxes
around them - this count for 16 bytes on 32-bit CPU. but cells with
small Char are preallocated at program startup, so it will be 12 bytes
for ascii-only strings

but that is not the whole story :)  copying GC makes program's memory
usage 3 times larger, on average, than it really allocates while
compacting GC has only 2x overhead

ByteStrings are allocated in _unmovable_ part of GHC heap, so they
don't suffer from this problem. of course, it is not free -
program that creates and destroys large number of ByteStrings will
suffer from memory holes, which is right the problem solved by ghc's GC

so, for program that only allocates the difference may be 24/36 times
on average while for create-use-destroy-loop scenarios i can't make
any detailed prognoses

also, traversing those lengthy lists on GCs is very time-consuming

-- 
Best regards,
 Bulatmailto:[EMAIL PROTECTED]

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] Slow IO

2006-09-13 Thread Udo Stenzel

Ketil Malde wrote:
> Daniel Fischer <[EMAIL PROTECTED]> writes:
> 
> > Maybe I've misused the word segfault.
> 
> I think so.  A segfault is the operating-system complaining about an
> illegal memory access.  If you get them from Haskell, it is likely a
> bug in the compiler or run-time system (or you were using unsafeAt, or
> FFI). 

Far simpler:  This is really a segfault, and it's because of a
misfeature of Linux called "memory overcommitment".  When physical
memory runs out, Linux happily hands out more to applications requesting
it, in the vain hope that at least some of it is never accessed.
Therefore, malloc() is always successful, but when the memory is finally
accessed, it suddenly turns out that there isn't anything to access,
which results in a segfault.  No amount of error checking can prevent
that and it could have hit any process allocating memory when it ran
out.

Sane people turn overcommitment off.  Sane people wouldn't have
implemented it in the first place, either.

Udo.
-- 
The reasonable man adapts himself to the world; the unreasonable one
persists in trying to adapt the world to himself. Therefore all progress
depends on the unreasonable man. 
-- George Bernard Shaw

signature.asc
Description: Digital signature
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] Slow IO

2006-09-13 Thread Udo Stenzel

Daniel Fischer wrote:
> > Most certainly not.  I'm pretty sure this is to a bug in your code.
> > Something retains a data structure which is actually unneeded.  Probably
> 
> Apparently. And my money is on a load of lines from the file (of which I need 
> only the first and last Char).

Then you're doing it wrong[TM].  You shouldn't need to keep any part of
the input in memory.  Whatever it is, nobody can tell you without seeing
the code.  Try heap profiling, should you have no idea where to look for
leaks.

> How could I solve the problem without representing the graph in some way?

By using an advanced tool called "brains".  Sorry for not being more
specific, but that's actually the fun part of the challenge and I'm not
going to spoil it for you.  ;-)

> Forgive the stupid question, but where if not RAM would the chunk currently 
> processed reside?

Oh, I overlooked "chunk".  Well, yes, the "chunk" currently processed
needs to fit into RAM.  But how much of a problem could a single Char
pose?

Donald Bruce Stewart wrote:
> I agree. Some problems simply require you to hold large strings in
> memory. And for those, [Char] conks out around 5-10M (try reversing a
> 10M [Char]).

Sure, this one just isn't of that kind.

Udo.
-- 
"Irrationality is the square root of all evil"
-- Douglas Hofstadter

signature.asc
Description: Digital signature
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] Slow IO

2006-09-12 Thread Ketil Malde

Daniel Fischer <[EMAIL PROTECTED]> writes:

> Maybe I've misused the word segfault.

I think so.  A segfault is the operating-system complaining about an
illegal memory access.  If you get them from Haskell, it is likely a
bug in the compiler or run-time system (or you were using unsafeAt, or
FFI). 

> The programme consumed more and more memory (according to top),
> kswapd started to have a higher CPU-percentage than my programme,
> programme died, system yelling 'Speicherzugriffsfehler', top displays 
> 'kswapd'.

I find that swapping the GHC heap is not productive, so I tend to
limit memory to available RAM.  You can do that either by limiting
available memory to process in the shell (ulimit or limit), or by
specifying RTS options to the haskell program (typically +RTS -Mxxx,
where xxx is 80-90% of physical RAM).

>From the GHC docs (6.4.2):

   -Msize

[Default: unlimited] Set the maximum heap size to size bytes. The

I thought the default was set according to limits? 

heap normally grows and shrinks according to the memory
requirements of the program. The only reason for having this
option is to stop the heap growing without bound and filling up
all the available swap space, which at the least will result in
the program being summarily killed by the operating system. 

In my experience, a program which thrashes severely without heap
limits can run fine if you just limit the heap.  So it's not as much
an issue of 'filling up swap' vs. 'killed by the OS', but 'takes
forever, making the system unresponsive in the process' vs. 'tries
hard to complete in a timely fashion, or halts with an error'.  I much
prefer the latter.

> However the problem might be too little lazyness, because if I explicitly 
> read 
> the file line by line, memory usage remains low enough -- but ByteString is 
> *much* faster anyway.

Right.  You should still try to consume the file lazily, and make sure
the data can get discarded and GC'ed when you have no further use for
it.  But a String is something like 8 or 12 bytes per character, a
ByteString gets you down to 1.

-k
-- 
If I haven't seen further, it is by standing in the footprints of giants

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] Slow IO

2006-09-12 Thread Donald Bruce Stewart

daniel.is.fischer:
> Am Dienstag, 12. September 2006 22:26 schrieben Sie:
> > Daniel Fischer wrote:
> > > The programme consumed more and more memory (according to top),
> > > kswapd started to have a higher CPU-percentage than my programme,
> > > programme died, system yelling 'Speicherzugriffsfehler', top displays
> > > 'kswapd'.
> > > I believe that means my programme demanded more memory than I have
> > > available (only 256MB RAM + 800MB swap). Is that a segfault or what is
> > > the correct term?
> > >
> > > That is probably due to (apart from the stupidity of my IO-code) the
> > > large overhead of Haskell lists.
> >
> > Most certainly not.  I'm pretty sure this is to a bug in your code.
> > Something retains a data structure which is actually unneeded.  Probably
> 
> Apparently. And my money is on a load of lines from the file (of which I need 
> only the first and last Char).
> 
> > a case of "foldl" where "foldl'" should be used or a "try" in Parsec
> > code where it should be left out or a lot of "updateWiths" to a Map,
> > etc.  Or it could be a bad choice of data structure.  I bet, it's the
> > map you're using to represent the graph (which you don't even need to
> > represent at all, btw).
> 
> No foldl nor parsec around. I represent the graph as a
> 
> UArray (Char,Char) Int 
> 
> (I've switched to Int for the index type, too, when tuning the code), so that 
> shouldn't use much memory (array size is 676).
> The array is built via accumArray, I hope that's sufficiently efficient
> (though now I use unsafeAccumArrayUArray, that's faster).
> 
> How could I solve the problem without representing the graph in some way?
> Possibly that could be done more efficiently than I do it, but I can't 
> imagine 
> how to do it without representing the graph in some data structure.
> >
> > > So the chunk of the file which easily fits into my
> > > RAM in ByteString form is too large as a list of ordinary Strings.
> >
> > The chunk of file should never need to fit into RAM.  If that's a
> > problem, you also forgot to prime a crucial "foldl".
> >
> 
> Forgive the stupid question, but where if not RAM would the chunk currently 
> processed reside?

I agree. Some problems simply require you to hold large strings in
memory. And for those, [Char] conks out around 5-10M (try reversing a
10M [Char]).

-- Don
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] Slow IO

2006-09-12 Thread Daniel Fischer

Am Dienstag, 12. September 2006 22:26 schrieben Sie:
> Daniel Fischer wrote:
> > The programme consumed more and more memory (according to top),
> > kswapd started to have a higher CPU-percentage than my programme,
> > programme died, system yelling 'Speicherzugriffsfehler', top displays
> > 'kswapd'.
> > I believe that means my programme demanded more memory than I have
> > available (only 256MB RAM + 800MB swap). Is that a segfault or what is
> > the correct term?
> >
> > That is probably due to (apart from the stupidity of my IO-code) the
> > large overhead of Haskell lists.
>
> Most certainly not.  I'm pretty sure this is to a bug in your code.
> Something retains a data structure which is actually unneeded.  Probably

Apparently. And my money is on a load of lines from the file (of which I need 
only the first and last Char).

> a case of "foldl" where "foldl'" should be used or a "try" in Parsec
> code where it should be left out or a lot of "updateWiths" to a Map,
> etc.  Or it could be a bad choice of data structure.  I bet, it's the
> map you're using to represent the graph (which you don't even need to
> represent at all, btw).

No foldl nor parsec around. I represent the graph as a

UArray (Char,Char) Int 

(I've switched to Int for the index type, too, when tuning the code), so that 
shouldn't use much memory (array size is 676).
The array is built via accumArray, I hope that's sufficiently efficient
(though now I use unsafeAccumArrayUArray, that's faster).

How could I solve the problem without representing the graph in some way?
Possibly that could be done more efficiently than I do it, but I can't imagine 
how to do it without representing the graph in some data structure.
>
> > So the chunk of the file which easily fits into my
> > RAM in ByteString form is too large as a list of ordinary Strings.
>
> The chunk of file should never need to fit into RAM.  If that's a
> problem, you also forgot to prime a crucial "foldl".
>

Forgive the stupid question, but where if not RAM would the chunk currently 
processed reside?

>
> Udo.

Cheers,
Daniel

-- 

"In My Egotistical Opinion, most people's C programs should be
indented six feet downward and covered with dirt."
-- Blair P. Houghton

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] Slow IO

2006-09-12 Thread Udo Stenzel

Daniel Fischer wrote:
> The programme consumed more and more memory (according to top),
> kswapd started to have a higher CPU-percentage than my programme,
> programme died, system yelling 'Speicherzugriffsfehler', top displays 
> 'kswapd'.
> I believe that means my programme demanded more memory than I have available 
> (only 256MB RAM + 800MB swap). Is that a segfault or what is the correct 
> term?
> 
> That is probably due to (apart from the stupidity of my IO-code) the large 
> overhead of Haskell lists.

Most certainly not.  I'm pretty sure this is to a bug in your code.
Something retains a data structure which is actually unneeded.  Probably
a case of "foldl" where "foldl'" should be used or a "try" in Parsec
code where it should be left out or a lot of "updateWiths" to a Map,
etc.  Or it could be a bad choice of data structure.  I bet, it's the
map you're using to represent the graph (which you don't even need to
represent at all, btw).

> So the chunk of the file which easily fits into my 
> RAM in ByteString form is too large as a list of ordinary Strings.

The chunk of file should never need to fit into RAM.  If that's a
problem, you also forgot to prime a crucial "foldl".

Udo.
-- 
"Proof by analogy is fraud." -- Bjarne Stroustrup

signature.asc
Description: Digital signature
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] Slow IO

2006-09-12 Thread Jared Updike


Maybe I've misused the word segfault.
What happened is:
The programme consumed more and more memory (according to top),
kswapd started to have a higher CPU-percentage than my programme,
programme died, system yelling 'Speicherzugriffsfehler', top displays
'kswapd'.
I believe that means my programme demanded more memory than I have available
(only 256MB RAM + 800MB swap). Is that a segfault or what is the correct
term?


I thought a segfault was when a program reads or writes to protected
memory (for example, out of bounds on an array, or dereferencing a
null pointer, etc. things mostly avoided in Haskell). Luckily there
are lots of fun ways to crash computers! I've heard your "heap
overflow" called "exploding" because of how the process rapdily
expands its memory usage, filling all available space, causing system
collapse.

 Jared.
--
http://www.updike.org/~jared/
reverse ")-:"
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] Slow IO

2006-09-12 Thread Daniel Fischer

Am Montag, 11. September 2006 18:22 schrieben Sie:
> On 9/11/06, Daniel Fischer <[EMAIL PROTECTED]> wrote:
> > The problem spec states that the input file contains about 500 test
> > cases, each given by between 1 and 100,000 lines, each line containing a
> > single word of between 2 and 1000 letters.
> > So the file should be about 12.5G on average.
>
> I don't think that that necessarily follows. Although I've never seen
Not necessarily of course. But as the problem contains a warning about large 
input/output data, I assumed the worst reasonable case (uniform 
distribution).

> the input file, of course, I imagine that many cases are fairly small,
> but designed to test the accuracy of your algorithm. A few are large
> (in one way or another) to test the extremes of your algorithm. But
> the overall size of the input file is probably much, much smaller than
> that estimate. (Maybe 1MB? Maybe 10MB?)

That'd be peanuts with ByteString, 1MB even without.
>
> > A time limit of 7s is given.
>
> That's CPU time, and thus not including I/O, right? I couldn't find

Doesn't IO use the CPU? But seriously, I would have thought that's 
what 'time' lists under user and that includes IO-time as far as I can tell.

> the answer on the SPOJ site. Their FAQ is a forum, but I don't see it
> there:
>
>
> http://www.spoj.pl/forum/viewforum.php?f=6&sid=6c8fb9c3216c3abd1e720f8b4b56
>82b3
>
> In any case, the problem can't be too large; the top twenty programs
> all finished in under 0.35 (CPU seconds?). Even if yours is a tenth as
> fast as the C and C++ programs, that would be 3.5s -- twice as fast as
> it needs to be.
>
> http://www.spoj.pl/ranks/WORDS1/
>
> Of course, you don't have access to these other programs for
> comparison; but I hope that this gives you a better idea of the size
> (and manageability) of the task.
>
> Good luck with it!
>
> --Tom Phoenix

-- 

"In My Egotistical Opinion, most people's C programs should be
indented six feet downward and covered with dirt."
-- Blair P. Houghton

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] Slow IO

2006-09-12 Thread Daniel Fischer

Am Montag, 11. September 2006 17:44 schrieben Sie:
> Daniel Fischer wrote:
> > >   Try Don Stewart's ByteString library
> >
> > -- and some data that made the programme segfault before now run in a
> > couple of seconds.
>
> But that shouldn't happen!  Segfaults aren't performance problems, but
> errors.  So your program contained a bug (lazyness where it isn't
> useful, I suspect), and ByString is hiding it.

Maybe I've misused the word segfault.
What happened is:
The programme consumed more and more memory (according to top),
kswapd started to have a higher CPU-percentage than my programme,
programme died, system yelling 'Speicherzugriffsfehler', top displays 
'kswapd'.
I believe that means my programme demanded more memory than I have available 
(only 256MB RAM + 800MB swap). Is that a segfault or what is the correct 
term?

That is probably due to (apart from the stupidity of my IO-code) the large 
overhead of Haskell lists. So the chunk of the file which easily fits into my 
RAM in ByteString form is too large as a list of ordinary Strings.
However the problem might be too little lazyness, because if I explicitly read 
the file line by line, memory usage remains low enough -- but ByteString is 
*much* faster anyway.

>
> > So even if we just counted newlines, we would have to scan 1,700 million
> > (1.7*10^9) chars per second.
> > Could any ordinary computer of today do that, using whatever language?
>
> That rate should be no problem for a 2GHz machine.  However, a 200MHz 64
> bit wide bus won't deliver the data fast enough and it is 50x as much as
> the best hard disks could deliver in optimal circumstances.  I guess,
> most of the test cases are a lot smaller than your guesstimate.
>

I suppose so, too, but not knowing the test data, I assumed a bad case 
(uniform distribution of line-lengths and test-case-size in the specified 
range).

>
> Udo.

Bulat:
> are you mean arithmetic or geometric average? ;)
I meant 'expected value'.
If X_i are independent random variables uniformly distributed on [0 .. k],
Y is a random variable (independent from the X_i) uniformly distributed on
[1 .. n] and Z is the sum of the first Y of the X_i, then the expected value 
of Z is (n+1)*k/4.
So we might call that a weighted arithmetic average, I suppose.

Cheers,
Daniel

-- 

"In My Egotistical Opinion, most people's C programs should be
indented six feet downward and covered with dirt."
-- Blair P. Houghton

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: Re: [Haskell-cafe] Slow IO

2006-09-11 Thread Tom Phoenix


On 9/11/06, Daniel Fischer <[EMAIL PROTECTED]> wrote:


The problem spec states that the input file contains about 500 test cases,
each given by between 1 and 100,000 lines, each line containing a single word
of between 2 and 1000 letters.
So the file should be about 12.5G on average.


I don't think that that necessarily follows. Although I've never seen
the input file, of course, I imagine that many cases are fairly small,
but designed to test the accuracy of your algorithm. A few are large
(in one way or another) to test the extremes of your algorithm. But
the overall size of the input file is probably much, much smaller than
that estimate. (Maybe 1MB? Maybe 10MB?)


A time limit of 7s is given.


That's CPU time, and thus not including I/O, right? I couldn't find
the answer on the SPOJ site. Their FAQ is a forum, but I don't see it
there:

   
http://www.spoj.pl/forum/viewforum.php?f=6&sid=6c8fb9c3216c3abd1e720f8b4b5682b3

In any case, the problem can't be too large; the top twenty programs
all finished in under 0.35 (CPU seconds?). Even if yours is a tenth as
fast as the C and C++ programs, that would be 3.5s -- twice as fast as
it needs to be.

   http://www.spoj.pl/ranks/WORDS1/

Of course, you don't have access to these other programs for
comparison; but I hope that this gives you a better idea of the size
(and manageability) of the task.

Good luck with it!

--Tom Phoenix
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re[2]: [Haskell-cafe] Slow IO

2006-09-11 Thread Bulat Ziganshin

Hello Daniel,

Monday, September 11, 2006, 6:05:38 PM, you wrote:

> The problem spec states that the input file contains about 500 test cases,
> each given by between 1 and 100,000 lines, each line containing a single word
> of between 2 and 1000 letters.
> So the file should be about 12.5G on average.

are you mean arithmetic or geometric average? ;)



-- 
Best regards,
 Bulatmailto:[EMAIL PROTECTED]

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] Slow IO

2006-09-11 Thread Daniel Fischer

Am Sonntag, 10. September 2006 02:29 schrieben Sie:
> Hello,
>
>   Try Don Stewart's ByteString library
> (http://www.cse.unsw.edu.au/~dons/fps.html). It is much faster than
> the standard Haskell IO and now has lazy.
>
> -Jeff

Yay, that's really an improvement!
Depending on the size of the file/graph data, using ByteString.Lazy.Char8 
reduces the run time by ranging from 'just a little' to a factor of over 30 
-- and some data that made the programme segfault before now run in a couple 
of seconds.
Thanks Bryan, Simon, David, Don & Duncan.

However, the programme still spends the overwhelming part of the time doing 
IO, and I've started thinking about it. 
The problem spec states that the input file contains about 500 test cases, 
each given by between 1 and 100,000 lines, each line containing a single word 
of between 2 and 1000 letters.
So the file should be about 12.5G on average.
A time limit of 7s is given. 
So even if we just counted newlines, we would have to scan 1,700 million 
(1.7*10^9) chars per second. 
Could any ordinary computer of today do that, using whatever language?

Cheers,
Daniel

>
> On 9/9/06, Daniel Fischer <[EMAIL PROTECTED]> wrote:
> > Hello all,
> > Now I have an IO-problem, too.
> > SPOJ problem 41 asks basically to determine whether a directed graph
> > allows a path that uses every edge exactly once. The data from which the
> > graphs are to be constructed are contained in a (huge) file, every item
> > (number of test cases, size of test case, edges) on a single line. I've
> > created my own test case file (much smaller, but already 217 MB, 2.2
> > million lines) to check my programme's performance. Now profiling reveals
> > that over 90% of time and allocation are consumed by reading the file
> > (line by line, reading the whole file as a lazy String and then chopping
> > that to pieces is rather worse).
> >
> > So how do I quickly
> > 1. read an Integer n from the file,
> > 2. read the next n lines and
> > 3. pass the first and last letter of those to the function that builds
> > and checks the graph?
> >
> > When that is achieved, I could see whether my algorithm is good.
> >
> > Thanks for any help,
> > Daniel
> > --
> >

-- 

"In My Egotistical Opinion, most people's C programs should be
indented six feet downward and covered with dirt."
-- Blair P. Houghton

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] Slow IO

2006-09-09 Thread jeff p


Hello,

 Try Don Stewart's ByteString library
(http://www.cse.unsw.edu.au/~dons/fps.html). It is much faster than
the standard Haskell IO and now has lazy.

-Jeff

On 9/9/06, Daniel Fischer <[EMAIL PROTECTED]> wrote:

Hello all,
Now I have an IO-problem, too.
SPOJ problem 41 asks basically to determine whether a directed graph allows a
path that uses every edge exactly once. The data from which the graphs are to
be constructed are contained in a (huge) file, every item (number of test
cases, size of test case, edges) on a single line. I've created my own test
case file (much smaller, but already 217 MB, 2.2 million lines) to check my
programme's performance. Now profiling reveals that over 90% of time and
allocation are consumed by reading the file (line by line, reading the whole
file as a lazy String and then chopping that to pieces is rather worse).

So how do I quickly
1. read an Integer n from the file,
2. read the next n lines and
3. pass the first and last letter of those to the function that builds and
checks the graph?

When that is achieved, I could see whether my algorithm is good.

Thanks for any help,
Daniel
--

"In My Egotistical Opinion, most people's C programs should be
indented six feet downward and covered with dirt."
-- Blair P. Houghton

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

[Haskell-cafe] Slow IO

2006-09-09 Thread Daniel Fischer

Hello all,
Now I have an IO-problem, too.
SPOJ problem 41 asks basically to determine whether a directed graph allows a 
path that uses every edge exactly once. The data from which the graphs are to 
be constructed are contained in a (huge) file, every item (number of test 
cases, size of test case, edges) on a single line. I've created my own test 
case file (much smaller, but already 217 MB, 2.2 million lines) to check my 
programme's performance. Now profiling reveals that over 90% of time and 
allocation are consumed by reading the file (line by line, reading the whole 
file as a lazy String and then chopping that to pieces is rather worse).

So how do I quickly 
1. read an Integer n from the file,
2. read the next n lines and
3. pass the first and last letter of those to the function that builds and 
checks the graph?

When that is achieved, I could see whether my algorithm is good.

Thanks for any help,
Daniel
-- 

"In My Egotistical Opinion, most people's C programs should be
indented six feet downward and covered with dirt."
-- Blair P. Houghton

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

53 matches

Mail list logo