Re: [Haskell-cafe] Memory corruption issues when using newAlignedPinnedByteArray, GC kicking in?

2012-07-10 Thread Thomas Schilling
I think you should ask this question on the glasgow-haskell-users
mailing list: http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

On 10 July 2012 18:20, Nicolas Trangez  wrote:
> All,
>
> While working on my vector-simd library, I noticed that memory I'm
> using somehow gets corrupted/overwritten. I reworked this into a test
> case, and would love to get some help on how to fix this.
>
> Previously I used some custom FFI calls to C to allocate aligned memory,
> which yields correct results, but this has a significant (+- 10x)
> performance impact on my benchmarks. Later on I discovered the
> newAlignedPinnedByteArray# function, and wrote some code using this.
>
> Here's what I did in the test case: I created an MVector instance, with
> the exact same implementation as vector's
> Data.Vector.Storable.Mutable.MVector instance, except for basicUnsafeNew
> where I pass one more argument to mallocVector [1].
>
> I also use 3 different versions of mallocVector (depending on
> compile-time flags):
>
> mallocVectorOrig [2]: This is the upstream version, discarding the
> integer argument I added.
>
> Then here's my first attempt, very similar to the implementation of
> mallocPlainForeignPtrBytes [3] at [4] using GHC.* libraries.
>
> Finally there's something similar at [5] which uses the 'primitive'
> library.
>
> The test case creates vectors of increasing size, then checks whether
> they contain the expected values. For the default implementation this
> works correctly. For both others it fails at some random size, and the
> values stored in the vector are not exactly what they should be.
>
> I don't understand what's going on here. I suspect I lack a reference
> (or something along those lines) so the GC kicks in, or maybe the buffer
> gets relocated when it shouldn't.
>
> Basically I'd need something like
>
> GHC.ForeignPtr.mallocPlainAlignedForeignPtrBytes :: Int -> Int -> IO
> (ForeignPtr a)
>
> Thanks,
>
> Nicolas
>
> [1] https://gist.github.com/3084806#LC37
> [2] https://gist.github.com/3084806#LC119
> [3]
> http://hackage.haskell.org/packages/archive/base/latest/doc/html/src/GHC-ForeignPtr.html
> [4] https://gist.github.com/3084806#LC100
> [5] https://gist.github.com/3084806#LC81
>
>
>
>
> ___
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe



-- 
Push the envelope. Watch it bend.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Memory corruption issues when using newAlignedPinnedByteArray, GC kicking in?

2012-07-10 Thread Nicolas Trangez
All,

While working on my vector-simd library, I noticed that memory I'm
using somehow gets corrupted/overwritten. I reworked this into a test
case, and would love to get some help on how to fix this.

Previously I used some custom FFI calls to C to allocate aligned memory,
which yields correct results, but this has a significant (+- 10x)
performance impact on my benchmarks. Later on I discovered the
newAlignedPinnedByteArray# function, and wrote some code using this.

Here's what I did in the test case: I created an MVector instance, with
the exact same implementation as vector's
Data.Vector.Storable.Mutable.MVector instance, except for basicUnsafeNew
where I pass one more argument to mallocVector [1].

I also use 3 different versions of mallocVector (depending on
compile-time flags):

mallocVectorOrig [2]: This is the upstream version, discarding the
integer argument I added.

Then here's my first attempt, very similar to the implementation of
mallocPlainForeignPtrBytes [3] at [4] using GHC.* libraries.

Finally there's something similar at [5] which uses the 'primitive'
library.

The test case creates vectors of increasing size, then checks whether
they contain the expected values. For the default implementation this
works correctly. For both others it fails at some random size, and the
values stored in the vector are not exactly what they should be.

I don't understand what's going on here. I suspect I lack a reference
(or something along those lines) so the GC kicks in, or maybe the buffer
gets relocated when it shouldn't.

Basically I'd need something like

GHC.ForeignPtr.mallocPlainAlignedForeignPtrBytes :: Int -> Int -> IO
(ForeignPtr a)
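
i.e., something along these lines -- a sketch modelled on
GHC.ForeignPtr.mallocPlainForeignPtrBytes with an extra alignment
argument, untested:

    {-# LANGUAGE MagicHash, UnboxedTuples #-}
    import GHC.Exts (Int(I#), newAlignedPinnedByteArray#,
                     byteArrayContents#, unsafeCoerce#)
    import GHC.ForeignPtr (ForeignPtr(..), ForeignPtrContents(PlainPtr))
    import GHC.IO (IO(IO))

    -- Allocate size bytes, pinned and aligned; no finalizer bookkeeping,
    -- so the backing MutableByteArray# stays alive via the ForeignPtr.
    mallocPlainAlignedForeignPtrBytes :: Int -> Int -> IO (ForeignPtr a)
    mallocPlainAlignedForeignPtrBytes (I# size) (I# align) = IO $ \s ->
        case newAlignedPinnedByteArray# size align s of
            (# s', mbarr# #) ->
                (# s', ForeignPtr (byteArrayContents# (unsafeCoerce# mbarr#))
                                  (PlainPtr mbarr#) #)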

Thanks,

Nicolas

[1] https://gist.github.com/3084806#LC37
[2] https://gist.github.com/3084806#LC119
[3]
http://hackage.haskell.org/packages/archive/base/latest/doc/html/src/GHC-ForeignPtr.html
[4] https://gist.github.com/3084806#LC100
[5] https://gist.github.com/3084806#LC81




___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Memory leak from close fields in conduit

2012-04-11 Thread Michael Snoyman
Hi all,

Hideyuki Tanaka alerted me[1] to a memory leak in conduit. Long story
short: it appears that Pipe composition leads to the accumulation of a
large number of `return ()` finalizer actions for cleanup that is no
longer necessary. We came up with a possible solution: a Finalize
type[2]. In both our tests, this eliminates the memory leak in question.
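
To illustrate the shape of the problem, here's a toy sketch -- much
simplified from the real Pipe type, just for intuition -- of how
composition chains finalizers so that even `return ()` cleanups
accumulate:

    -- Only the constructor that matters for this leak.
    data Pipe m o = Done | HaveOutput (Pipe m o) (m ()) o

    -- Composing adds the upstream finalizer in front of each downstream
    -- one, so no-op finalizers pile up into ever-longer actions:
    addCleanup :: Monad m => m () -> Pipe m o -> Pipe m o
    addCleanup _   Done                  = Done
    addCleanup fin (HaveOutput next f o) =
        HaveOutput (addCleanup fin next) (fin >> f) o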

So now I have a few questions:

1. Does anyone have a different idea on how to approach this problem?
2. Are there any refinements to be applied to the solution?
3. Technically, this is an API breakage, but (a) conduit 0.4 was
released very recently, (b) this is a fairly important change, and (c)
the real-world breakage from this is very minimal. Will anyone call
foul if I release such a change as 0.4.1?

Just to stress 3c: I compiled all of Yesod against this modified
version of conduit, and only http-conduit had to be tweaked at all.

Michael

[1] https://github.com/snoyberg/conduit/issues/40
[2] 
https://github.com/snoyberg/conduit/commit/ebc7a68e0f1f0c1b80c0105916b00060207944dd#L0R37

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Memory usage with TVar?

2012-02-01 Thread Jake McArthur
First, see this question about space usage on Stack Overflow:

http://stackoverflow.com/questions/3254758/memory-footprint-of-haskell-data-types

Next, apply this knowledge not only to Ints, but also to tuples and
lists. There's your memory usage.
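
To make that concrete: on a 64-bit GHC a boxed Int is 2 words, a (,) cell
is 3 words, and a (:) cell is 3 words, a word being 8 bytes. So, assuming
the gist keeps a list of (Int, Int) pairs in the TVar, each element costs
roughly

    3 (:) + 3 (,) + 2 + 2 (Ints) = 10 words = 80 bytes

-- five times the 16 bytes of raw payload per pair -- and on top of that
the copying collector needs extra headroom during GC.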

- Jake

On Wed, Feb 1, 2012 at 10:29 AM, Johan Brinch  wrote:
> Here's the example program:
> https://gist.github.com/1cbe113d2c79e2fc9d2b
>
> When I run the program (which maintains a list inside an STM TVar), I
> get the following statistics:
>> ./Test +RTS -s
>     176,041,728 bytes allocated in the heap
>     386,794,976 bytes copied during GC
>      69,180,224 bytes maximum residency (7 sample(s))
>      42,651,080 bytes maximum slop
>             179 MB total memory in use (0 MB lost due to fragmentation)
>
>                                    Tot time (elapsed)  Avg pause  Max pause
>  Gen  0       336 colls,     0 par    0.44s    0.44s     0.0013s    0.0033s
>  Gen  1         7 colls,     0 par    0.39s    0.40s     0.0570s    0.1968s
>
>  INIT    time    0.00s  (  0.00s elapsed)
>  MUT     time    0.23s  (  0.23s elapsed)
>  GC      time    0.83s  (  0.84s elapsed)
>  EXIT    time    0.00s  (  0.00s elapsed)
>  Total   time    1.06s  (  1.07s elapsed)
>
>  %GC     time      77.9%  (78.3% elapsed)
>
>  Alloc rate    749,153,093 bytes per MUT second
>
>  Productivity  22.1% of total user, 22.0% of total elapsed
>
>
> How come this program uses 179 MB of memory? I'm on a 64-bit machine
> where 2'000'000 integers uses 32 MB. Where does the overhead come
> from?
>
> --
> Johan Brinch
>
> ___
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Memory usage with TVar?

2012-02-01 Thread Johan Brinch
On Wed, Feb 1, 2012 at 16:29, Johan Brinch  wrote:
> I'm on a 64-bit machine
> where 2'000'000 integers uses 32 MB.

2 * 2'000'000 ;)

-- 
Johan Brinch

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Memory usage with TVar?

2012-02-01 Thread Johan Brinch
Here's the example program:
https://gist.github.com/1cbe113d2c79e2fc9d2b

When I run the program (which maintains a list inside an STM TVar), I
get the following statistics:
> ./Test +RTS -s
     176,041,728 bytes allocated in the heap
     386,794,976 bytes copied during GC
      69,180,224 bytes maximum residency (7 sample(s))
      42,651,080 bytes maximum slop
             179 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0       336 colls,     0 par    0.44s    0.44s     0.0013s    0.0033s
  Gen  1         7 colls,     0 par    0.39s    0.40s     0.0570s    0.1968s

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time    0.23s  (  0.23s elapsed)
  GC      time    0.83s  (  0.84s elapsed)
  EXIT    time    0.00s  (  0.00s elapsed)
  Total   time    1.06s  (  1.07s elapsed)

  %GC     time      77.9%  (78.3% elapsed)

  Alloc rate    749,153,093 bytes per MUT second

  Productivity  22.1% of total user, 22.0% of total elapsed


How come this program uses 179 MB of memory? I'm on a 64-bit machine
where 2'000'000 integers uses 32 MB. Where does the overhead come
from?

-- 
Johan Brinch

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Memory management issue in notmuch-haskell bindings

2011-08-16 Thread Ben Gamari
It seems that the notmuch-haskell bindings (version 0.2.2 built against
notmuch from git master; passes notmuch-test) aren't dealing with memory
management properly. In particular, the attached test code[1] causes
talloc to abort.  Unfortunately, while the issue is consistently
reproducible, it only occurs with some queries (see source[1]). I have
been unable to establish the exact criterion for failure.

It seems that the crash is caused by an invalid access to a freed Query
object while freeing a Messages object (see Valgrind trace[3]). I've
taken a brief look at the bindings themselves but, being only minimally
familiar with the FFI, there's nothing obviously wrong (the finalizers
passed to newForeignPtr look sane). I was under the impression that
talloc was reference counted, so the Query object shouldn't have been
freed unless there was still a Messages object holding a
reference. Any idea what might have gone wrong here?  Thanks!
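
One idiom I've seen suggested for this kind of parent/child lifetime --
sketched here with hypothetical names, and with the caveats about
finalizer ordering noted in the Foreign.ForeignPtr documentation -- is to
have the Messages finalizer touch the Query's ForeignPtr, so the Query
stays reachable at least until the Messages finalizer has run:

    import Foreign.Ptr (Ptr)
    import Foreign.ForeignPtr (ForeignPtr, touchForeignPtr)
    import qualified Foreign.Concurrent as FC

    -- 'destroyMessages' stands in for whatever the bindings actually call.
    newMessagesFP :: ForeignPtr query -> Ptr msgs
                  -> (Ptr msgs -> IO ()) -> IO (ForeignPtr msgs)
    newMessagesFP queryFP p destroyMessages =
        FC.newForeignPtr p $ do
            destroyMessages p
            touchForeignPtr queryFP  -- keep the Query alive until here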

Cheers,

- Ben



[1] Test case,

import Data.List
import Control.Monad
import System.Environment
import Foreign.Notmuch

dbpath = "/home/ben/.mail"

getAddresses :: Database -> String -> IO [String]
getAddresses db q = do
query <- queryCreate db q
msgs <- queryMessages query
addrs <- mapM (flip messageGetHeader $ "From") msgs
return addrs

main = do
db <- databaseOpen dbpath DatabaseModeReadOnly
--addrs2 <- getAddresses db "tag:haskell" -- This succeeds
addrs3 <- getAddresses db "to:dietz" -- This fails

--print addrs2
--print addrs3

databaseClose db



[2] Crashed session and backtrace,

[1217 ben@ben-laptop ~] $ ghc test.hs -auto-all -rtsopts -prof && gdb ./test 
GNU gdb (Ubuntu/Linaro 7.2-1ubuntu11) 7.2
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see:
...
Reading symbols from /home/ben/test...(no debugging symbols found)...done.
(gdb) run
Starting program: /home/ben/test 
[Thread debugging using libthread_db enabled]

Program received signal SIGABRT, Aborted.
0x75979d05 in raise (sig=6) at 
../nptl/sysdeps/unix/sysv/linux/raise.c:64
64  ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
in ../nptl/sysdeps/unix/sysv/linux/raise.c
(gdb) bt
#0  0x75979d05 in raise (sig=6) at 
../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x7597dab6 in abort () at abort.c:92
#2  0x76de038c in talloc_abort (reason=0x76de56e8 "Bad talloc magic 
value - access after free") at ../talloc.c:210
#3  0x76de0271 in talloc_abort_access_after_free (ptr=0x769190, 
location=0x77bd4e04 "lib/messages.c:142") at ../talloc.c:229
#4  talloc_chunk_from_ptr (ptr=0x769190, location=0x77bd4e04 
"lib/messages.c:142") at ../talloc.c:250
#5  _talloc_free (ptr=0x769190, location=0x77bd4e04 "lib/messages.c:142") 
at ../talloc.c:1164
#6  0x77bc7e65 in notmuch_messages_destroy (messages=0x769190) at 
lib/messages.c:142
#7  0x004de1c9 in scheduleFinalizers ()
#8  0x004e013d in GarbageCollect ()
#9  0x004d9e40 in scheduleDoGC.clone.18 ()
#10 0x004db0e0 in exitScheduler ()
#11 0x004d9066 in hs_exit_ ()
#12 0x004d940a in shutdownHaskellAndExit ()
#13 0x004d8a91 in real_main ()
#14 0x004d8ade in hs_main ()
#15 0x75964eff in __libc_start_main (main=0x408ed0 , argc=1, 
ubp_av=0x7fffe4f8, init=, fini=, 
rtld_fini=, stack_end=0x7fffe4e8) at 
libc-start.c:226
#16 0x00407791 in _start ()
(gdb) 


[3] Valgrind output,

==25241== Memcheck, a memory error detector
==25241== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al.
==25241== Using Valgrind-3.6.1 and LibVEX; rerun with -h for copyright info
==25241== Command: ./test
==25241== 
==25241== Conditional jump or move depends on uninitialised value(s)
==25241==at 0x52BB510: inflateReset2 (in 
/lib/x86_64-linux-gnu/libz.so.1.2.3.4)
==25241==by 0x52BB605: inflateInit2_ (in 
/lib/x86_64-linux-gnu/libz.so.1.2.3.4)
==25241==by 0x5F211BE: ChertTable::lazy_alloc_inflate_zstream() const 
(chert_table.cc:1672)
==25241==by 0x5F23B06: ChertTable::read_tag(Cursor*, std::string*, bool) 
const (chert_table.cc:1264)
==25241==by 0x5F260F9: ChertTable::get_exact_entry(std::string const&, 
std::string&) const (chert_table.cc:1210)
==25241==by 0x5F26DE2: 
ChertTermList::ChertTermList(Xapian::Internal::RefCntPtr, 
unsigned int) (chert_termlist.cc:44)
==25241==by 0x5EFF2E5: ChertDatabase::open_term_list(unsigned int) const 
(chert_database.cc:891)
==25241==by 0x5E7E7FB: Xapian::Document::termlist_begin() const 
(omdo

Re: [Haskell-cafe] Memory and Threads - MVars or TVars

2010-07-29 Thread Thomas DuBuisson
On Thu, Jul 29, 2010 at 6:53 AM, Job Vranish  wrote:
>
> You might try pulling down the package ('cabal fetch orc' will do
> this) and changing the base dependency (to >= 4.1) in the orc.cabal file

cabal also has an 'unpack' command for the particularly lazy (me).  Ex:

 cabal unpack orc ; sed "s/base\W*>= 4.2/base >= 4.1/" orc*/*.cabal ;
cd orc* ; cabal install

Should unpack, fix the .cabal file, and install.

Cheers,
Thomas
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Memory and Threads - MVars or TVars

2010-07-29 Thread Job Vranish
You might try pulling down the package ('cabal fetch orc' will do
this), changing the base dependency (to >= 4.1) in the orc.cabal file, and
then building it manually (cabal configure && cabal build && cabal install,
while in the same directory as the .cabal file) and see what happens.

I don't see any obvious reason why it would need a version greater than
6.10, so it might just be an over-restrictive dependency rule, but I might
be missing something.

- Job

On Wed, Jul 28, 2010 at 10:49 PM, Eitan Goldshtrom
wrote:

> Ah! That clears that up a lot. I read the wiki page but something just
> didn't make full sense about it until you used the word "prevent". I
> understand that the computer doesn't actually prevent other threads from
> running -- that would defeat the purpose of the concurrency -- but it helped
> clear it up. Perhaps you guys could help me with Cabal now though? I'm
> trying to install Orc but it wants base>=4.2 and <=4.3 and I have 4.1 after
> installing the latest release of GHC. Cabal won't upgrade the base. It
> complains about a dependency to "integer-simple". Anyone know what that's
> about?
>
>
> -Eitan
> ___
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Memory and Threads - MVars or TVars

2010-07-29 Thread Ben Millwood
On Thu, Jul 29, 2010 at 3:49 AM, Eitan Goldshtrom
 wrote:
> Perhaps you guys could help me with Cabal now though? I'm
> trying to install Orc but it wants base>=4.2 and <=4.3 and I have 4.1 after
> installing the latest release of GHC. Cabal won't upgrade the base. It
> complains about a dependency to "integer-simple". Anyone know what that's
> about?
>

The latest version of GHC (6.12) comes with base 4.2. In general, base
versions are tied to compiler versions, and you can't upgrade base on
its own.
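
(For reference, the approximate pairing: GHC 6.8.x shipped base-3.0.x,
GHC 6.10.x shipped base-4.0/4.1, and GHC 6.12.x ships base-4.2.)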
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Memory and Threads - MVars or TVars

2010-07-28 Thread Christopher Done
This could be useful: Beautiful concurrency by Simon Peyton Jones

http://research.microsoft.com/en-us/um/people/simonpj/papers/stm/beautiful.pdf

On 29 July 2010 02:23, Eitan Goldshtrom  wrote:
> Hi everyone. I was wondering if someone could just guide me toward some good
> information, but if anyone wants to help with a personal explanation I
> welcome it. I'm trying to write a threaded program and I'm not sure how to
> manage my memory. I read up on MVars and they make a lot of sense. My real
> question is what is "atomic" and how does it apply to TVars? I don't
> understand what atomic transactions are and I can't seem to find a concise
> explanation. I also saw some stuff about TMVars? But I can't find much on
> them either. Any help would be appreciated.
>
> -Eitan
>
> ___
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
>
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Memory and Threads - MVars or TVars

2010-07-28 Thread Eitan Goldshtrom
Ah! That clears that up a lot. I read the wiki page but something just 
didn't make full sense about it until you used the word "prevent". I 
understand that the computer doesn't actually prevent other threads from 
running -- that would defeat the purpose of the concurrency -- but it 
helped clear it up. Perhaps you guys could help me with Cabal now 
though? I'm trying to install Orc but it wants base>=4.2 and <=4.3 and I 
have 4.1 after installing the latest release of GHC. Cabal won't upgrade 
the base. It complains about a dependency to "integer-simple". Anyone 
know what that's about?


-Eitan
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Memory and Threads - MVars or TVars

2010-07-28 Thread Job Vranish
Atomic operations are special operations where you don't have to worry about
some other process messing with things while the operation is taking place.

For a simple example of why atomic operations are important:
(taken from: http://en.wikipedia.org/wiki/Linearizability#Non-atomic)

The naive, non-atomic implementation:

   1. reads the value in the memory location;
   2. adds one to the value;
   3. writes the new value back into the memory location.

Now, imagine two processes running, each incrementing a single, shared
memory location:

   1. the first process reads the value in memory location;
   2. the first process adds one to the value;

but before it can write the new value back to the memory location it is
suspended, and the second process is allowed to run:

   1. the second process reads the value in memory location, the *same* value
   that the first process read;
   2. the second process adds one to the value;
   3. the second process writes the new value into the memory location.

The second process is suspended and the first process allowed to run again:

   1. the first process writes a now-wrong value into the memory location,
   unaware that the other process has already updated the value in the memory
   location.


Atomic operations fix this problem by preventing (STM is a little fancier
and doesn't actually _prevent_, but you can pretend that it does) any other
process from writing to the memory in question until the computation is
finished and the result is written back.


For many simple cases something like atomicModifyIORef is all you really
need. However, if you have cases where you need to make sure _multiple_
IORefs/MVars/TVars/etc. are not written to until you're finished, then you
really need something like STM's 'atomically' function, which runs a block
of STM operations "atomically".
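
Here's a minimal sketch contrasting the racy read-modify-write with the
atomic alternatives (illustrative code, not from any particular program):

    import Data.IORef
    import Control.Concurrent.STM

    racyIncrement :: IORef Int -> IO ()
    racyIncrement ref = do
        v <- readIORef ref      -- read...
        writeIORef ref (v + 1)  -- ...write; another thread can interleave

    atomicIncrement :: IORef Int -> IO ()
    atomicIncrement ref = atomicModifyIORef ref (\v -> (v + 1, ()))

    -- Updating *two* shared variables as one indivisible step is where
    -- STM's 'atomically' comes in:
    transfer :: TVar Int -> TVar Int -> Int -> IO ()
    transfer from to n = atomically $ do
        a <- readTVar from
        writeTVar from (a - n)
        b <- readTVar to
        writeTVar to (b + n)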

Hope that helps,

- Job

On Wed, Jul 28, 2010 at 8:23 PM, Eitan Goldshtrom wrote:

>  Hi everyone. I was wondering if someone could just guide me toward some
> good information, but if anyone wants to help with a personal explanation I
> welcome it. I'm trying to write a threaded program and I'm not sure how to
> manage my memory. I read up on MVars and they make a lot of sense. My real
> question is what is "atomic" and how does it apply to TVars? I don't
> understand what atomic transactions are and I can't seem to find a concise
> explanation. I also saw some stuff about TMVars? But I can't find much on
> them either. Any help would be appreciated.
>
> -Eitan
>
> ___
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
>
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Memory and Threads - MVars or TVars

2010-07-28 Thread Eitan Goldshtrom
Hi everyone. I was wondering if someone could just guide me toward some 
good information, but if anyone wants to help with a personal 
explanation I welcome it. I'm trying to write a threaded program and I'm 
not sure how to manage my memory. I read up on MVars and they make a lot 
of sense. My real question is what is "atomic" and how does it apply to 
TVars? I don't understand what atomic transactions are and I can't seem 
to find a concise explanation. I also saw some stuff about TMVars? But I 
can't find much on them either. Any help would be appreciated.


-Eitan
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] memory needed for SAX parsing XML

2010-04-23 Thread Daniil Elovkov

John Lato wrote:



Another (additional) approach would be to encapsulate unsafeInterleaveIO
within some routine and not let it go out into the wild.

lazilyDoWithIO :: IO a -> (a -> b) -> IO b

It would use unsafeInterleave internally but catch all IO errors within
itself.

I wonder if this is a reasonable idea? Maybe it's already done?
So the topic is shifting...


doWithIO :: NFData b => IO a -> (a -> b) -> IO b
doWithIO m f = liftM (\a -> let b = f a in b `deepseq` b) m

It works (just stick it in a "try" block for error handling), but you
need to write a lot of NFData instances.  You also need to be careful
that b is some sort of reduced structure, or you can end up forcing
the whole file (or other data) into memory.


I meant a different thing. In your example there is no unsafeInterleave
at all. I think you mean that the 'm' argument is supposed to be an
unsafeInterleaved IO action, like getContents, and deepseq'ing saves us
from it hanging somewhere for a long time. Ok


But I meant to have a routine that lets us use ordinary io actions in a 
lazy way, and restricting that 'hanging' within bounds of this routine.


But having thought about it a little more I understood that it's impossible.

Lazy IO works now because unsafeInterleaveIO is stuck into getContents
itself, and is called repeatedly (via recursion). Or at least I can
think of this implementation; I haven't looked into it.


I realised that calling unsafeInterleaveIO on an ordinary IO action
will not make it run lazily. It will still run all at once, just not
now, but later.


So, to cope with it, I can think of exposing a little structure of an io 
action. Normally it's completely opaque. But if we knew where its 
recursion point lies, then we could control its course of execution.


So, if we had something like
  type RecIO a = IO a -> IO a
and io actions were like
  getContents :: RecIO [Char]
  getContents rec = do
      c <- readOneChar
      rest <- rec
      return (c:rest)

then we could either run them
  normally :: RecIO a -> IO a
  normally r = r (normally r)
or
  lazily :: RecIO a -> IO a
  lazily r = unsafeInterleaveIO $ r (lazily r)

And
  lazilyDoWithIO :: RecIO a -> (a -> b) -> IO b
  lazilyDoWithIO m f = do
      a <- lazily m
      return $ f a

Hmm, but then we would have to take special care not to let it out of
this function anyway... So here we come back to the deepseq'ing you
proposed.


And anyway, instead of re-writing pure functions to become iteratees we 
will have to re-write io functions to adopt continuation passing style.


Initially it looked better to me :)

But with this approach we can run lazily any io action that has the form 
of RecIO. Also, we can interleave normally and lazily based on the time 
of day and other conditions :)




It also doesn't help with
other IO effects, e.g. writing output.  I consider this one of the
nicest features of iteratee-based processing.


Can you clarify what's the problem with writing? I think I just haven't 
switched from the topic of gluing code.


Because as for gluing code, type signature for io writing d -> IO () is 
perfectly sufficient.



--
Daniil Elovkov
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] memory needed for SAX parsing XML

2010-04-22 Thread John Lato
> Message: 8
> Date: Tue, 20 Apr 2010 12:08:36 +0400
> From: Daniil Elovkov 
> Subject: Re: [Haskell-cafe] memory needed for SAX parsing XML
> To: Haskell-Cafe 
> Message-ID: <4bcd6104.50...@googlemail.com>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> Jason Dagit wrote:
>>
>>
>> On Mon, Apr 19, 2010 at 3:01 AM, Daniil Elovkov
>> mailto:daniil.elov...@googlemail.com>>
>> wrote:
>
>> I think iteratees are slowly catching on as an alternative to lazy io.
>> Basically, the iteratee approach uses a left fold style to stream the
>> data and process it in chunks, including some exception handling.
>> Unfortunately, I think it may also require a special sax parser that is
>> specifically geared towards iteratee use.  Having an iteratee based sax
>> parser would make processing large xml streams very convenient in
>> haskell.  Hint, hint, if you want to write a library :)  (Or, maybe it
>> exists, I admit that I haven't checked.)
>
>
> Iteratees  seem like a natural thing if we want to completely avoid
> unsafeInterleaveIO. But the presence of the latter is so good for
> modularity.
>
> We can glue IO code and pure code of the type String -> a so seamlessly.
> In case of iteratees, as I understand, pure functions of the type String
> -> a would no longer be usable with IO String result. The signature (and
> the code itself) would have to be changed to be left fold.

To some extent, yes, although the amount of work required can vary.

The general rule of thumb is that functions can be used directly with
iteratees if they can work strictly.  Functions that rely on laziness
need some adaptation, although how much varies.

If you've built your string handling functions out of a parser
combinator library, e.g. parsec-3 or attoparsec, you can just lift the
parser into an iteratee and use all your existing functions.

Since you're using HaXml, this should work.  The only missing part is
a polyparse-iteratee converter.  I haven't used polyparse, but it
looks like the converter would be similar to the one used for
attoparsec in the attoparsec-iteratee package.
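
For intuition, the conversion amounts to feeding chunks into attoparsec's
incremental interface until the parser finishes. A rough sketch, assuming
attoparsec's Fail/Partial/Done result type (module names may differ by
version), independent of any particular iteratee library:

    import qualified Data.Attoparsec.ByteString as A
    import qualified Data.ByteString as B

    -- 'getChunk' stands in for whatever supplies input; an empty chunk
    -- signals end of input to the parser.
    runOnChunks :: A.Parser a -> IO B.ByteString -> IO (Either String a)
    runOnChunks p getChunk = go (A.parse p) =<< getChunk
      where
        go k chunk = case k chunk of
            A.Fail _ _ err -> return (Left err)
            A.Partial k'   -> go k' =<< getChunk
            A.Done _ r     -> return (Right r)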

That said, it's not how I would do it.  Since SAX is a stream
processor, an iteratee-based SAX implementation would be a good fit.
I would write a lexer and parser (using a parser combinator library)
and then use those with Data.Iteratee.convStream.  That's how I would
write an iteratee-based SAX parser.  HaXml already includes a suitable
lexer and parser, but unfortunately they're not exposed.

>
> Another (additional) approach would be to encapsulate unsafeInterleaveIO
> within some routine and not let it go out into the wild.
>
> lazilyDoWithIO :: IO a -> (a -> b) -> IO b
>
> It would use unsafeInterleave internally but catch all IO errors within
> itself.
>
> I wonder if this is a reasonable idea? Maybe it's already done?
> So the topic is shifting...

doWithIO :: NFData b => IO a -> (a -> b) -> IO b
doWithIO m f = liftM (\a -> let b = f a in b `deepseq` b) m

It works (just stick it in a "try" block for error handling), but you
need to write a lot of NFData instances.  You also need to be careful
that b is some sort of reduced structure, or you can end up forcing
the whole file (or other data) into memory.  It also doesn't help with
other IO effects, e.g. writing output.  I consider this one of the
nicest features of iteratee-based processing.

John
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] memory needed for SAX parsing XML

2010-04-20 Thread Daniil Elovkov

Jason Dagit wrote:



On Mon, Apr 19, 2010 at 3:01 AM, Daniil Elovkov 
mailto:daniil.elov...@googlemail.com>> 
wrote:


Hello haskellers!

I'm trying to process an xml file with as little footprint as
possible. SAX is alright for my case, and I think that's the
lightest way possible. So, I'm looking at HaXml.SAX

I'm surprised to see that it takes about 56-60 MB of ram. This seems
constant relative to xml file size, which is expected. Only slightly
depends on it as I recursively traverse the list of sax events. But
it seems like too much.


For me these sorts of problems always involve investigation into the 
root cause.  I'm just not good enough at predicting what is causing the 
memory consumption.  Thankfully, GHC has great tools for this sort of 
investigative work.  The book real-world haskell documents how to use 
those tools:

http://book.realworldhaskell.org/read/profiling-and-optimization.html

If you haven't already, I highly recommend looking at the profiling 
graphs.  See if you can figure out if your program has any space leaks.


No, I didn't profile it yet. I just made up the simplest example (while 
evaluating a few xml parsers) and saw the result.


Given the triviality of the example I considered that it could be a
"feature" of the HaXml SAX parser, and that somebody would say that it is
indeed so.





The size of the file is from 1MB to 20MB.

The code is something like this

main = do
   (fn:_) <- getArgs
   h <- openFile fn ReadMode
   c <- hGetContents h
   let out = proc $ fst $ saxParse fn c
   putStrLn out
   getChar


For such a simple program you won't run into any problem with lazy IO, 
but as your program grows in complexity it will very likely come back to 
bite you.  If you're not familiar with lazy IO, I'm referring to the 
hGetContents.  Some example problems:






Thanks, I'm aware of that. Speaking of this, I've always felt a bit 
uncomfortable about lazy IO. I've just looked at the safe-lazy-io 
package now.


I think iteratees are slowly catching on as an alternative to lazy io.  
Basically, the iteratee approach uses a left fold style to stream the 
data and process it in chunks, including some exception handling.  
Unfortunately, I think it may also require a special sax parser that is 
specifically geared towards iteratee use.  Having an iteratee based sax 
parser would make processing large xml streams very convenient in 
haskell.  Hint, hint, if you want to write a library :)  (Or, maybe it 
exists, I admit that I haven't checked.)



Iteratees  seem like a natural thing if we want to completely avoid 
unsafeInterleaveIO. But the presence of the latter is so good for 
modularity.


We can glue IO code and pure code of the type String -> a so seamlessly. 
In case of iteratees, as I understand, pure functions of the type String 
-> a would no longer be usable with IO String result. The signature (and 
the code itself) would have to be changed to be left fold.


Another (additional) approach would be to encapsulate unsafeInterleaveIO 
within some routine and not let it go out into the wild.


lazilyDoWithIO :: IO a -> (a -> b) -> IO b

It would use unsafeInterleave internally but catch all IO errors within 
itself.


I wonder if this is a reasonable idea? Maybe it's already done?
So the topic is shifting...

--
Daniil Elovkov
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] memory needed for SAX parsing XML

2010-04-19 Thread Jason Dagit
On Mon, Apr 19, 2010 at 3:01 AM, Daniil Elovkov <
daniil.elov...@googlemail.com> wrote:

> Hello haskellers!
>
> I'm trying to process an xml file with as little footprint as possible. SAX
> is alright for my case, and I think that's the lightest way possible. So,
> I'm looking at HaXml.SAX
>
> I'm surprised to see that it takes about 56-60 MB of ram. This seems
> constant relative to xml file size, which is expected. Only slightly depends
> on it as I recursively traverse the list of sax events. But it seems like
> too much.
>

For me these sorts of problems always involve investigation into the root
cause.  I'm just not good enough at predicting what is causing the memory
consumption.  Thankfully, GHC has great tools for this sort of investigative
work.  The book real-world haskell documents how to use those tools:
http://book.realworldhaskell.org/read/profiling-and-optimization.html

If you haven't already, I highly recommend looking at the profiling graphs.
See if you can figure out if your program has any space leaks.


>
> The size of the file is from 1MB to 20MB.
>
> The code is something like this
>
> main = do
>(fn:_) <- getArgs
>h <- openFile fn ReadMode
>c <- hGetContents h
>let out = proc $ fst $ saxParse fn c
>putStrLn out
>getChar
>

For such a simple program you won't run into any problem with lazy IO, but
as your program grows in complexity it will very likely come back to bite
you.  If you're not familiar with lazy IO, I'm referring to the
hGetContents.  Some example problems:
1) If you opened many files this way, you could run out of file handles
(lazy IO closing of handles is unpredictable, but file handles are a scarce
resource).  The safe-io package on hackage can help you avoid this
particular pitfall.
2) Reading of the file will happen during your pure code.  This implies that
IO exceptions can happen in your pure code.  It also means that in some ways
you'll be able to observe side-effects in your pure code.
3) If you were to reference 'c' from two places in main, the GC would not
collect any of it until both references were collectable. To avoid that
leak, you'd need to load the data twice.

I'm sure there are other things that can go wrong that i've missed.

I think iteratees are slowly catching on as an alternative to lazy io.
Basically, the iteratee approach uses a left fold style to stream the data
and process it in chunks, including some exception handling.  Unfortunately,
I think it may also require a special sax parser that is specifically geared
towards iteratee use.  Having an iteratee based sax parser would make
processing large xml streams very convenient in haskell.  Hint, hint, if you
want to write a library :)  (Or, maybe it exists, I admit that I haven't
checked.)

I hope that helps,
Jason
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] memory needed for SAX parsing XML

2010-04-19 Thread Daniil Elovkov

Hello haskellers!

I'm trying to process an xml file with as little footprint as possible. 
SAX is alright for my case, and I think that's the lightest way 
possible. So, I'm looking at HaXml.SAX


I'm surprised to see that it takes about 56-60 MB of ram. This seems 
constant relative to xml file size, which is expected. Only slightly 
depends on it as I recursively traverse the list of sax events. But it 
seems like too much.


The size of the file is from 1MB to 20MB.

The code is something like this

main = do
    (fn:_) <- getArgs
    h <- openFile fn ReadMode
    c <- hGetContents h
    let out = proc $ fst $ saxParse fn c
    putStrLn out
    getChar

-- 'f' is tail-recursive
-- 'i' is used only to ensure that the list is completely traversed

proc es = show $ f es 0
  where f :: [SaxElement] -> Int -> Int
        f (SaxElementOpen name _ : rest) i = f rest i
        f (_ : rest) i = f rest i
        f [] i = i


Thanks for you thoughts!

--
Daniil Elovkov
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Memory-aware Haskell?

2009-12-25 Thread Daniel Fischer
On Friday, 25 December 2009 15:45:29, Gwern Branwen wrote:
> On Fri, Dec 25, 2009 at 5:14 AM, Svein Ove Aas  wrote:
> > On Thu, Dec 24, 2009 at 11:38 PM, Roman Cheplyaka  wrote:
> >> So, let's think what we can do at runtime. Suppose RTS takes the
> >> parameter -- upper limit of consumed memory. When it sees that memory
> >> consumption is close to upper bound, it can:
> >>
> >> 1. force garbage collection
> >
> > This is already implemented. See the -M RTS option.
>
> Correct me if I'm wrong - it's been a while since I tried to use the
> -M option to deal with Gitit memory usage on darcs.net - but doesn't
> -M just kill the program after it reaches a set RAM, and doesn't
> trigger any GCs?

It does affect GC behaviour. According to the user's guide:

"
-Msize 
[Default: unlimited] Set the maximum heap size to size bytes. The heap normally 
grows and 
shrinks according to the memory requirements of the program. The only reason 
for having 
this option is to stop the heap growing without bound and filling up all the 
available 
swap space, which at the least will result in the program being summarily 
killed by the 
operating system.
The maximum heap size also affects other garbage collection parameters: when 
the amount of 
live data in the heap exceeds a certain fraction of the maximum heap size, 
compacting 
collection will be automatically enabled for the oldest generation, and the -F 
parameter 
will be reduced in order to avoid exceeding the maximum heap size."
 

Reducing the -F parameter (the amount of memory reserved for the older
generations, as a factor of the amount of live data) triggers more GCs,
so the RTS tries not to exceed the limit you've given. You can indeed
reduce the memory used by your programme via the -M option, but only to
some extent. As far as I know it will not collect data that is held on
to by live data, even if it could be reconstructed later when needed
again.
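
For example, assuming RTS options are enabled for the binary, you can cap
the heap at 64 MB and make collection more eager with something like:

    ./prog +RTS -M64m -F1.2 -RTS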
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Memory-aware Haskell?

2009-12-25 Thread Gwern Branwen
On Fri, Dec 25, 2009 at 5:14 AM, Svein Ove Aas  wrote:
> On Thu, Dec 24, 2009 at 11:38 PM, Roman Cheplyaka  wrote:
>> So, let's think what we can do at runtime. Suppose RTS takes the parameter --
>> upper limit of consumed memory. When it sees that memory consumption is
>> close to upper bound, it can:
>>
>> 1. force garbage collection
>>
> This is already implemented. See the -M RTS option.

Correct me if I'm wrong - it's been a while since I tried to use the
-M option to deal with Gitit memory usage on darcs.net - but doesn't
-M just kill the program after it reaches a set RAM, and doesn't
trigger any GCs?

-- 
gwern
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Memory-aware Haskell?

2009-12-25 Thread Svein Ove Aas
On Thu, Dec 24, 2009 at 11:38 PM, Roman Cheplyaka  wrote:
> So, let's think what we can do at runtime. Suppose RTS takes the parameter --
> upper limit of consumed memory. When it sees that memory consumption is
> close to upper bound, it can:
>
> 1. force garbage collection
>
This is already implemented. See the -M RTS option.

> 2. apply some heuristics to find and reduce some chunks which will
>   benefit from reduction in terms of size
>
For example, by un-evaluating them. It would often be possible to
reduce size like that, just so long as they aren't evaluated again the
next second. Well, this one would be hard.

> 3. if nothing helps, throw an exception. It can be caught in IO and a
>   memory-aware program can make an appropriate decision -- e.g. abort
>   opening a large file and gracefully warn the user.
>
That should be relatively simple. Get implementing, will you? :D

> (And there still is a problem of foreign code whose memory consumption
> we know nothing about...)
>
In theory, you could deal with that by hooking malloc.

-- 
Svein Ove Aas
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Memory-aware Haskell?

2009-12-24 Thread John Van Enk
This is a problem with partitioned operating systems used in avionics. The
airplane computers require certain partitions to exist between programs in
both time and space. The space guarantees are most easily enforced by
eliminating any dynamic memory allocation once the operating system enters a
'normal mode'. This means that all memory needs to be allocated on the stack
and ahead of time.

This reduces the calculation of the upper memory bound to checking what the
largest possible stack a process could grow to is. Of course, calculating
this maximum stack is difficult in its own right.

Unfortunately, nothing I've just discussed seems to port well to Haskell (I
could be wrong!). Given this, I think you're looking at a very hard problem
(but one I'd be incredibly interested in discovering a solution for!).

/jve

On Thu, Dec 24, 2009 at 5:38 PM, Roman Cheplyaka  wrote:

> Imagine some system with hard memory bound (e.g. 64M of physical memory,
> no swap). I don't want some accidental laziness (maybe even not in my
> code, but in some used package) to crash my program.
>
> So, let's think what we can do at runtime. Suppose RTS takes the parameter
> --
> upper limit of consumed memory. When it sees that memory consumption is
> close to upper bound, it can:
>
> 1. force garbage collection
>
> 2. apply some heuristics to find and reduce some chunks which will
>   benefit from reduction in terms of size
>
> 3. if nothing helps, throw an exception. It can be caught in IO and a
>   memory-aware program can make an appropriate decision -- e.g. abort
>   opening a large file and gracefully warn the user.
>
> (And there still is a problem of foreign code whose memory consumption
> we know nothing about...)
>
> What do you think?
>
> --
> Roman I. Cheplyaka :: http://ro-che.info/
> "Don't let school get in the way of your education." - Mark Twain
> ___
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Memory-aware Haskell?

2009-12-24 Thread Roman Cheplyaka
Imagine some system with hard memory bound (e.g. 64M of physical memory,
no swap). I don't want some accidental laziness (maybe even not in my
code, but in some used package) to crash my program.

So, let's think what we can do at runtime. Suppose RTS takes the parameter --
upper limit of consumed memory. When it sees that memory consumption is
close to upper bound, it can:

1. force garbage collection

2. apply some heuristics to find and reduce some chunks which will
   benefit from reduction in terms of size

3. if nothing helps, throw an exception. It can be caught in IO and a
   memory-aware program can make an appropriate decision -- e.g. abort
   opening a large file and gracefully warn the user.

(And there still is a problem of foreign code whose memory consumption
we know nothing about...)

What do you think?

-- 
Roman I. Cheplyaka :: http://ro-che.info/
"Don't let school get in the way of your education." - Mark Twain
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Memory Leak - Artificial Neural Network

2009-11-05 Thread wren ng thornton

Hector Guilarte wrote:

Hi Luke,

The code is mainly in Spanish with some parts in English...

Thanks for the explanation, I got the idea very well, but now I have some
questions about it.

How do the Prelude functions for managing lists work? I mean, what do zip,
unzip, foldl, foldr, map and zipWith do? Tail recursion or corecursion? I
know, thanks to the profiling I did, that my main memory leak is in the
function "entrenamiento" (which means training in Spanish), and I strongly
suspect it is in my use of those functions I mentioned before, so if they
are tail recursive and I change them to my own corecursive version, maybe
I'll get rid of the problem, won't I?


Don't worry about the Prelude definitions, by and large they're the 
"right" definitions. If you're curious then just search for Prelude.hs 
on your system, or check it out online[1].


As a more general high-level suggestion, the most efficient way to 
implement feedforward ANNs is to treat them as matrix multiplication 
problems and use matrices/arrays rather than lists. For a three-layer
network of N, M, and O nodes we thus:

* start with an N-wide vector of inputs
* multiply by the N*M matrix of weights, to get an M-vector
* map sigmoid or other activation function
* multiply by the M*O matrix of weights for the next layer to get
  an O-vector
* apply some interpretation (e.g. winner-take-all) to the output

There are various libraries for optimized matrix multiplication, but 
even just using an unboxed array for the matrices will make it much 
faster to traverse through things.
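
A minimal sketch of one feedforward layer in that style, using unboxed
arrays (all names here are illustrative, not from any particular library):

    import Data.Array.Unboxed

    type Vec = UArray Int Double
    type Mat = UArray (Int, Int) Double  -- weights, indexed (row, column)

    sigmoid :: Double -> Double
    sigmoid x = 1 / (1 + exp (negate x))

    -- Multiply an M*N weight matrix by an N-vector to get an M-vector.
    matVec :: Mat -> Vec -> Vec
    matVec w v = listArray (r0, r1)
        [ sum [ w ! (i, j) * v ! j | j <- [c0 .. c1] ] | i <- [r0 .. r1] ]
      where ((r0, c0), (r1, c1)) = bounds w

    -- One layer: weighted sum followed by the activation function.
    layer :: Mat -> Vec -> Vec
    layer w = amap sigmoid . matVec w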



[1] http://haskell.org/ghc/docs/latest/html/libraries/base/Prelude.html
See the "Source Code" link at the top of the page.

--
Live well,
~wren
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Memory usage of cabal install

2009-04-27 Thread Krzysztof Kościuszkiewicz
On Mon, Apr 27, 2009 at 02:10:28PM +0100, Duncan Coutts wrote:

> If you're using ghc 6.10 then the solution is to update to cabal-install
> 0.6.x. If you're quite sure you are using 6.8 then the bug is unknown.
> It may still be worth trying upgrading to cabal-install 0.6.x.

I've upgraded to cabal-install 0.6.2 and the problem went away.

Thanks for help!
-- 
Krzysztof Kościuszkiewicz
"Simplicity is the ultimate sophistication" -- Leonardo da Vinci
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Memory usage of cabal install

2009-04-27 Thread Krzysztof Kościuszkiewicz
On Mon, Apr 27, 2009 at 02:10:28PM +0100, Duncan Coutts wrote:

> > [...]
> > Increasing verbosity does not help, memory consumption goes up after the
> > message "Resolving dependencies..." shows up.
> > 
> > I use ghc 6.8.2 and cabal-install version 0.5.1 using version 1.4.0.1 of
> > the Cabal library.
> 
> The only instance of this behaviour that I know of involves using that
> version of cabal-install with ghc-6.10.1.
> 
> http://haskell.org/cabal/FAQ.html#cabal-goes-into-an-infinite-loop--runs-out-of-memory
>
> [...]
>
> Are you absolutely sure you are using ghc 6.8 and not 6.10?

Yes:

> k...@copper:~$ ghc -V ; cabal -V
> The Glorious Glasgow Haskell Compilation System, version 6.8.2
> cabal-install version 0.5.1
> using version 1.4.0.1 of the Cabal library 

> > Is there a workaround? I would like to avoid fetching & building packages
> > manually.
> 
> If you're using ghc 6.10 then the solution is to update to cabal-install
> 0.6.x. If you're quite sure you are using 6.8 then the bug is unknown.
> It may still be worth trying upgrading to cabal-install 0.6.x.

I'll try that and report success/failure.

Thanks,
-- 
Krzysztof Kościuszkiewicz
"Simplicity is the ultimate sophistication" -- Leonardo da Vinci
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Memory usage of cabal install

2009-04-27 Thread Duncan Coutts
On Mon, 2009-04-27 at 12:56 +0200, Krzysztof Kościuszkiewicz wrote:
> Hello Haskell-Café,
> 
> I have a problem with high memory usage of cabal-install.  Whenever I
> try to install or upgrade a package, cabal manages to consume 1,3G of
> memory before I killed it (on a 32-bit machine with 1 GB of memory).
> 
> Increasing verbosity does not help, memory consumption goes up after the
> message "Resolving dependencies..." shows up.
> 
> I use ghc 6.8.2 and cabal-install version 0.5.1 using version 1.4.0.1 of
> the Cabal library.

The only instance of this behaviour that I know of involves using that
version of cabal-install with ghc-6.10.1.

http://haskell.org/cabal/FAQ.html#cabal-goes-into-an-infinite-loop--runs-out-of-memory

The older version of cabal-install goes into an infinite loop when using
ghc-6.10, when it is presented with two installed versions of the base
package (and one depends on the other). In version 0.6.x cabal-install
was updated to understand this situation.

Are you absolutely sure you are using ghc 6.8 and not 6.10?

> Is there a workaround? I would like to avoid fetching & building packages
> manually.

If you're using ghc 6.10 then the solution is to update to cabal-install
0.6.x. If you're quite sure you are using 6.8 then the bug is unknown.
It may still be worth trying upgrading to cabal-install 0.6.x.

Duncan

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Memory usage of cabal install

2009-04-27 Thread Krzysztof Skrzętnicki
Which platform exactly does your problem concern?
I suggest upgrading to the newest version - it can be downloaded from the GHC site.

Regards

Krzysztof Skrzętnicki

2009/4/27 Krzysztof Kościuszkiewicz :
> Hello Haskell-Café,
>
> I have a problem with high memory usage of cabal-install.  Whenever I
> try to install or upgrade a package, cabal manages to consume 1,3G of
> memory before I killed it (on a 32-bit machine with 1 GB of memory).
>
> Increasing verbosity does not help, memory consumption goes up after the
> message "Resolving dependencies..." shows up.
>
> I use ghc 6.8.2 and cabal-install version 0.5.1 using version 1.4.0.1 of
> the Cabal library.
>
> Is there a workaround? I would like to avoid fetching & building packages
> manually.
> --
> Krzysztof Kościuszkiewicz
> "Simplicity is the ultimate sophistication" -- Leonardo da Vinci
> ___
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Memory usage of cabal install

2009-04-27 Thread Krzysztof Kościuszkiewicz
Hello Haskell-Café,

I have a problem with high memory usage of cabal-install.  Whenever I
try to install or upgrade a package, cabal manages to consume 1,3G of
memory before I killed it (on a 32-bit machine with 1 GB of memory).

Increasing verbosity does not help, memory consumption goes up after the
message "Resolving dependencies..." shows up.

I use ghc 6.8.2 and cabal-install version 0.5.1 using version 1.4.0.1 of
the Cabal library.

Is there a workaround? I would like to avoid fetching & building packages
manually.
-- 
Krzysztof Kościuszkiewicz
"Simplicity is the ultimate sophistication" -- Leonardo da Vinci
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Memory usage when passing arrays in state

2009-03-03 Thread Daniel Fischer
On Wednesday, 4 March 2009 02:30, Tobias Olausson wrote:
> Thank you Daniel.
> As I understood it DiffArrays are supposed to be faster than the regular

They may be supposed to be faster, but they aren't.
If you want anything resembling speed, use UArrays, STUArrays, or, if your 
array elements cannot be unboxed, plain Arrays and STArrays, or some other 
array-package from Hackage (I don't know which are good, though, the above 
are good enough for me).

> Array due to the fact that it doesn't copy the entire Array, but just
> updates the position that changes, and keeps some kind of "changelog" on
> the array. But when looking at the statistics for my sample program, it
> seems that it allocates a lot more than what should be needed, which would
> indicate that maybe the array is copied anyway.
> At this point, the DiffArray/DiffUArray are the only functional arrays,
> right?

No. They are probably the most dysfunctional arrays around.

> I mean, I can add two and two together and see that it
> equals...four, and if the only functional array is sort of broken, that
> means so is my program. Are there any alternatives that are fast aswell?
>
> //Tobias

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Memory usage when passing arrays in state

2009-03-03 Thread Ryan Ingram
I've found DiffArrays to be way too slow/memory-hogging for real usage.

Since you are in IO already (StateT s IO), you'll probably be better
off using a mutable array for a data structure.

Some things are still best done in the imperative style.  You can be a
bit safer by using ST as the bottom monad, however, and returning the
result as a pure array.  This gives you a single copy per "runST".

Alternatively, use IntMap instead of arrays if you want a pure data
structure with reasonably efficient update/lookup.
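
A sketch of the IntMap route for the RAM use case (names illustrative):

    import Data.Word (Word8)
    import qualified Data.IntMap as IM

    type Ram = IM.IntMap Word8

    writeByte :: Int -> Word8 -> Ram -> Ram
    writeByte = IM.insert

    readByte :: Int -> Ram -> Word8
    readByte addr = IM.findWithDefault 0 addr  -- unwritten cells read as 0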

  -- ryan

On Tue, Mar 3, 2009 at 4:44 PM, Tobias Olausson  wrote:
> Hello all.
> I am currently implementing an emulation of a CPU, in which the CPU's
> RAM is part of the internal state
> that is passed around in the program using a state monad. However, the
> program performs
> unexpectingly bad, and some profiling information makes us believe
> that the problem is the high
> memory usage of the program.
>
> The program below is similar to our main program used when testing a
> sorting algorithm in this CPU:
>
> module Main where
>
> import Control.Monad.State.Lazy
> import Data.Word
> import Data.Array.Diff
> import Control.Concurrent (threadDelay)
>
> data LoopState = LoopState
>    { intVal :: Integer
>    , diff   :: DiffUArray Word8 Word8
>    }
>
> initState :: LoopState
> initState = LoopState 0 (array (0x00,0xFF) [(idx,0)|idx<-[0x00..0xFF]])
>
> main :: IO ()
> main = do
>    execStateT looper initState >>= putStrLn . show . intVal
>
> looper :: StateT LoopState IO ()
> looper = do
>    st <- get
>    let res = intVal st + 1
>        idx = fromIntegral res
>    put $ st { intVal = res, diff = (diff st) // [(idx,idx)] }
>    if res == 1300
>        then return ()
>        else looper
>
> Of course our program does more than updating a counter ;-)
> Compiling and running this program yields the following result:
>
> [~]:[olaussot] >> ghc --make -O2 -o array ArrayPlay.hs
> [~]:[olaussot] >> ./array +RTS -sstderr
> ./array +RTS -sstderr
> 1300
>     313,219,740 bytes allocated in the heap
>   1,009,986,984 bytes copied during GC
>     200,014,828 bytes maximum residency (8 sample(s))
>       4,946,648 bytes maximum slop
>             393 MB total memory in use (3 MB lost due to fragmentation)
>
>  Generation 0:   590 collections,     0 parallel,  3.06s,  3.09s elapsed
>  Generation 1:     8 collections,     0 parallel,  3.56s,  4.21s elapsed
>
>  INIT  time    0.00s  (  0.00s elapsed)
>  MUT   time    0.27s  (  0.27s elapsed)
>  GC    time    6.62s  (  7.30s elapsed)
>  EXIT  time    0.00s  (  0.00s elapsed)
>  Total time    6.89s  (  7.57s elapsed)
>
>  %GC time      96.1%  (96.4% elapsed)
>
>  Alloc rate    1,155,958,754 bytes per MUT second
>
>  Productivity   3.9% of total user, 3.6% of total elapsed
>
> Why does the program spend 96.1% of its total running time collecting garbage?
> Any tips to make this program perform better are appreciated.
> Please do tell if anything is unclear.
>
> --
> Tobias Olausson
> tob...@gmail.com
> ___
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Memory usage when passing arrays in state

2009-03-03 Thread Daniel Fischer
On Wednesday, 4 March 2009 01:44, Tobias Olausson wrote:
> Hello all.
> I am currently implementing an emulation of a CPU, in which the CPU's
> RAM is part of the internal state
> that is passed around in the program using a state monad. However, the
> program performs
> unexpectedly badly, and some profiling information makes us believe
> that the problem is the high
> memory usage of the program.
>
> The program below is similar to our main program used when testing a
> sorting algorithm in this CPU:
>
> module Main where
>
> import Control.Monad.State.Lazy

Not good, use Control.Monad.State.Strict

> import Data.Word
> import Data.Array.Diff
> import Control.Concurrent (threadDelay)
>
> data LoopState = LoopState
>    { intVal :: Integer
>    , diff   :: DiffUArray Word8 Word8

Diff(U)Arrays tend to be slow, use them with care.

>    }
>
> initState :: LoopState
> initState = LoopState 0 (array (0x00,0xFF) [(idx,0)|idx<-[0x00..0xFF]])
>
> main :: IO ()
> main = do
>    execStateT looper initState >>= putStrLn . show . intVal
>
> looper :: StateT LoopState IO ()
> looper = do
>    st <- get
>    let res = intVal st + 1
>        idx = fromIntegral res
>    put $ st { intVal = res, diff = (diff st) // [(idx,idx)] }
>    if res == 1300
>        then return ()
>        else looper

You're being too lazy, building a huge thunk that only gets evaluated at the
end of the loop. You have to force evaluation earlier (see the revised
program below).

>
> Of course our program does more than updating a counter ;-)
> Compiling and running this program yields the following result:
>
> [~]:[olaussot] >> ghc --make -O2 -o array ArrayPlay.hs
> [~]:[olaussot] >> ./array +RTS -sstderr
> ./array +RTS -sstderr
> 1300
>     313,219,740 bytes allocated in the heap
>   1,009,986,984 bytes copied during GC
>     200,014,828 bytes maximum residency (8 sample(s))
>       4,946,648 bytes maximum slop
>             393 MB total memory in use (3 MB lost due to fragmentation)
>
>  Generation 0:   590 collections,     0 parallel,  3.06s,  3.09s elapsed
>  Generation 1:     8 collections,     0 parallel,  3.56s,  4.21s elapsed
>
>  INIT  time    0.00s  (  0.00s elapsed)
>  MUT   time    0.27s  (  0.27s elapsed)
>  GC    time    6.62s  (  7.30s elapsed)
>  EXIT  time    0.00s  (  0.00s elapsed)
>  Total time    6.89s  (  7.57s elapsed)
>
>  %GC time      96.1%  (96.4% elapsed)
>
>  Alloc rate    1,155,958,754 bytes per MUT second
>
>  Productivity   3.9% of total user, 3.6% of total elapsed
>
> Why does the program spend 96.1% of its total running time collecting
> garbage? Any tips to make this program perform better are appreciated.
> Please do tell if anything is unclear.

Nothing gets evaluated until the end, so nothing can be discarded earlier.

--
{-# LANGUAGE BangPatterns #-}
module Main where

import Control.Monad.State.Strict
import Data.Word
import Data.Array.Unboxed
import Data.Array.ST
import Data.Array.MArray

update :: UArray Word8 Word8 -> Word8 -> Word8 -> UArray Word8 Word8
update arr i v = runSTUArray $ do
    sar <- unsafeThaw arr
    writeArray sar i v
    return sar

data LoopState = LoopState
    { intVal :: !Integer
    , diff   :: !(UArray Word8 Word8)
    }

initState :: LoopState
initState = LoopState 0 (array (0x00,0xFF) [(idx,0)|idx<-[0x00 .. 0xFF]])

main :: IO ()
main = do
    execStateT looper initState >>= putStrLn . show . intVal

looper :: StateT LoopState IO ()
looper = do
    LoopState i df <- get
    let res = i + 1
        idx = fromIntegral res
        !ndf = update df idx idx
    put (LoopState res ndf)
    if res == 1300
        then return ()
        else looper
--

This is much better behaved. I didn't investigate whether every strictness
annotation is necessary.

>
> --
> Tobias Olausson
> tob...@gmail.com

Cheers,
Daniel

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Memory usage when passing arrays in state

2009-03-03 Thread Tobias Olausson
Thank you Daniel.
As I understood it, DiffArrays are supposed to be faster than regular
Arrays because they don't copy the entire array, but just update the
position that changes and keep some kind of "changelog" on the array.
But when looking at the statistics for my sample program, it seems that
it allocates a lot more than what should be needed, which would indicate that
maybe the array is copied anyway.
At this point, the DiffArray/DiffUArray are the only functional arrays, right?
I mean, I can add two and two together and see that it equals...four, and
if the only functional array is sort of broken, that means so is my program.
Are there any alternatives that are fast as well?

//Tobias

2009/3/4 Daniel Peebles :
> This may be completely unrelated to your problem, but there's a ticket
> in the GHC trac saying that DiffArray is unusably slow:
> http://hackage.haskell.org/trac/ghc/ticket/2727 . It doesn't analyze
> the cause of the slowness, so it's quite possible that it may be
> related to GC as in your case.
>
> Cheers,
> Dan
>
> On Tue, Mar 3, 2009 at 7:44 PM, Tobias Olausson  wrote:
>> Hello all.
>> I am currently implementing an emulation of a CPU, in which the CPU's
>> RAM is part of the internal state
>> that is passed around in the program using a state monad. However, the
>> program performs
>> unexpectedly badly, and some profiling information makes us believe
>> that the problem is the high
>> memory usage of the program.
>>
>> The program below is similar to our main program used when testing a
>> sorting algorithm in this CPU:
>>
>> module Main where
>>
>> import Control.Monad.State.Lazy
>> import Data.Word
>> import Data.Array.Diff
>> import Control.Concurrent (threadDelay)
>>
>> data LoopState = LoopState
>>    { intVal :: Integer
>>    , diff   :: DiffUArray Word8 Word8
>>    }
>>
>> initState :: LoopState
>> initState = LoopState 0 (array (0x00,0xFF) [(idx,0)|idx<-[0x00..0xFF]])
>>
>> main :: IO ()
>> main = do
>>    execStateT looper initState >>= putStrLn . show . intVal
>>
>> looper :: StateT LoopState IO ()
>> looper = do
>>    st <- get
>>    let res = intVal st + 1
>>        idx = fromIntegral res
>>    put $ st { intVal = res, diff = (diff st) // [(idx,idx)] }
>>    if res == 1300
>>        then return ()
>>        else looper
>>
>> Of course our program does more than updating a counter ;-)
>> Compiling and running this program yields the following result:
>>
>> [~]:[olaussot] >> ghc --make -O2 -o array ArrayPlay.hs
>> [~]:[olaussot] >> ./array +RTS -sstderr
>> ./array +RTS -sstderr
>> 1300
>>     313,219,740 bytes allocated in the heap
>>   1,009,986,984 bytes copied during GC
>>     200,014,828 bytes maximum residency (8 sample(s))
>>       4,946,648 bytes maximum slop
>>             393 MB total memory in use (3 MB lost due to fragmentation)
>>
>>  Generation 0:   590 collections,     0 parallel,  3.06s,  3.09s elapsed
>>  Generation 1:     8 collections,     0 parallel,  3.56s,  4.21s elapsed
>>
>>  INIT  time    0.00s  (  0.00s elapsed)
>>  MUT   time    0.27s  (  0.27s elapsed)
>>  GC    time    6.62s  (  7.30s elapsed)
>>  EXIT  time    0.00s  (  0.00s elapsed)
>>  Total time    6.89s  (  7.57s elapsed)
>>
>>  %GC time      96.1%  (96.4% elapsed)
>>
>>  Alloc rate    1,155,958,754 bytes per MUT second
>>
>>  Productivity   3.9% of total user, 3.6% of total elapsed
>>
>> Why does the program spend 96.1% of its total running time collecting 
>> garbage?
>> Any tips to make this program perform better are appreciated.
>> Please do tell if anything is unclear.
>>
>> --
>> Tobias Olausson
>> tob...@gmail.com
>> ___
>> Haskell-Cafe mailing list
>> Haskell-Cafe@haskell.org
>> http://www.haskell.org/mailman/listinfo/haskell-cafe
>>
>



-- 
Tobias Olausson
tob...@gmail.com
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Memory usage when passing arrays in state

2009-03-03 Thread Daniel Peebles
This may be completely unrelated to your problem, but there's a ticket
in the GHC trac saying that DiffArray is unusably slow:
http://hackage.haskell.org/trac/ghc/ticket/2727 . It doesn't analyze
the cause of the slowness, so it's quite possible that it may be
related to GC as in your case.

Cheers,
Dan

On Tue, Mar 3, 2009 at 7:44 PM, Tobias Olausson  wrote:
> Hello all.
> I am currently implementing an emulation of a CPU, in which the CPU's
> RAM is part of the internal state
> that is passed around in the program using a state monad. However, the
> program performs
> unexpectedly badly, and some profiling information makes us believe
> that the problem is the high
> memory usage of the program.
>
> The program below is similar to our main program used when testing a
> sorting algorithm in this CPU:
>
> module Main where
>
> import Control.Monad.State.Lazy
> import Data.Word
> import Data.Array.Diff
> import Control.Concurrent (threadDelay)
>
> data LoopState = LoopState
>    { intVal :: Integer
>    , diff   :: DiffUArray Word8 Word8
>    }
>
> initState :: LoopState
> initState = LoopState 0 (array (0x00,0xFF) [(idx,0)|idx<-[0x00..0xFF]])
>
> main :: IO ()
> main = do
>    execStateT looper initState >>= putStrLn . show . intVal
>
> looper :: StateT LoopState IO ()
> looper = do
>    st <- get
>    let res = intVal st + 1
>        idx = fromIntegral res
>    put $ st { intVal = res, diff = (diff st) // [(idx,idx)] }
>    if res == 1300
>        then return ()
>        else looper
>
> Of course our program does more than updating a counter ;-)
> Compiling and running this program yields the following result:
>
> [~]:[olaussot] >> ghc --make -O2 -o array ArrayPlay.hs
> [~]:[olaussot] >> ./array +RTS -sstderr
> ./array +RTS -sstderr
> 1300
>     313,219,740 bytes allocated in the heap
>   1,009,986,984 bytes copied during GC
>     200,014,828 bytes maximum residency (8 sample(s))
>       4,946,648 bytes maximum slop
>             393 MB total memory in use (3 MB lost due to fragmentation)
>
>  Generation 0:   590 collections,     0 parallel,  3.06s,  3.09s elapsed
>  Generation 1:     8 collections,     0 parallel,  3.56s,  4.21s elapsed
>
>  INIT  time    0.00s  (  0.00s elapsed)
>  MUT   time    0.27s  (  0.27s elapsed)
>  GC    time    6.62s  (  7.30s elapsed)
>  EXIT  time    0.00s  (  0.00s elapsed)
>  Total time    6.89s  (  7.57s elapsed)
>
>  %GC time      96.1%  (96.4% elapsed)
>
>  Alloc rate    1,155,958,754 bytes per MUT second
>
>  Productivity   3.9% of total user, 3.6% of total elapsed
>
> Why does the program spend 96.1% of its total running time collecting garbage?
> Any tips to make this program perform better are appreciated.
> Please do tell if anything is unclear.
>
> --
> Tobias Olausson
> tob...@gmail.com
> ___
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Memory usage when passing arrays in state

2009-03-03 Thread Tobias Olausson
Hello all.
I am currently implementing an emulation of a CPU, in which the CPU's
RAM is part of the internal state
that is passed around in the program using a state monad. However, the
program performs
unexpectedly badly, and some profiling information makes us believe
that the problem is the high
memory usage of the program.

The program below is similar to our main program used when testing a
sorting algorithm in this CPU:

module Main where

import Control.Monad.State.Lazy
import Data.Word
import Data.Array.Diff
import Control.Concurrent (threadDelay)

data LoopState = LoopState
    { intVal :: Integer
    , diff   :: DiffUArray Word8 Word8
    }

initState :: LoopState
initState = LoopState 0 (array (0x00,0xFF) [(idx,0)|idx<-[0x00..0xFF]])

main :: IO ()
main = do
    execStateT looper initState >>= putStrLn . show . intVal

looper :: StateT LoopState IO ()
looper = do
    st <- get
    let res = intVal st + 1
        idx = fromIntegral res
    put $ st { intVal = res, diff = (diff st) // [(idx,idx)] }
    if res == 1300
        then return ()
        else looper

Of course our program does more than updating a counter ;-)
Compiling and running this program yields the following result:

[~]:[olaussot] >> ghc --make -O2 -o array ArrayPlay.hs
[~]:[olaussot] >> ./array +RTS -sstderr
./array +RTS -sstderr
1300
     313,219,740 bytes allocated in the heap
   1,009,986,984 bytes copied during GC
     200,014,828 bytes maximum residency (8 sample(s))
       4,946,648 bytes maximum slop
             393 MB total memory in use (3 MB lost due to fragmentation)

  Generation 0:   590 collections,     0 parallel,  3.06s,  3.09s elapsed
  Generation 1:     8 collections,     0 parallel,  3.56s,  4.21s elapsed

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time    0.27s  (  0.27s elapsed)
  GC    time    6.62s  (  7.30s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time    6.89s  (  7.57s elapsed)

  %GC time      96.1%  (96.4% elapsed)

  Alloc rate    1,155,958,754 bytes per MUT second

  Productivity   3.9% of total user, 3.6% of total elapsed

Why does the program spend 96.1% of its total running time collecting garbage?
Any tips to make this program perform better are appreciated.
Please do tell if anything is unclear.

--
Tobias Olausson
tob...@gmail.com
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] memory-efficient data type for Netflix data - UArray Int Int vs UArray Int Word8

2009-03-01 Thread Manlio Perillo

Kenneth Hoste wrote:

Hello,

I'm having a go at the Netflix Prize using Haskell. Yes, I'm brave.

I kind of have an algorithm in mind that I want to implement using
Haskell, but up until now, the main issue has been to find a way to
efficiently represent the data...

For people who are not familiar with the Netflix data, in short, it
consists of roughly 100M (1e8) user ratings (1-5, integer) for 17,770
different movies, coming from 480,109 different users.



Hi Kenneth.

I have written a simple program that parses the Netflix training data 
set, using this data structure:


type MovieRatings = IntMap (UArr Word32, UArr Word8)

The ratings are grouped by movies.
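
For reference, a sketch of how one movie's ratings come back out of
that structure, assuming uvector's Data.Array.Vector exports fromU
(UArr-to-list conversion); ratingsFor is a made-up helper:

import Data.Word (Word8, Word32)
import qualified Data.IntMap as IM
import Data.Array.Vector (UArr, fromU)  -- from the uvector package

type MovieRatings = IM.IntMap (UArr Word32, UArr Word8)

-- All (userId, rating) pairs recorded for one movie; empty if unknown.
ratingsFor :: Int -> MovieRatings -> [(Word32, Word8)]
ratingsFor movie m = case IM.lookup movie m of
    Nothing       -> []
    Just (us, rs) -> zip (fromU us) (fromU rs)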

The parsing is done in:
real    8m32.476s
user    3m5.276s
sys     0m8.681s

On a DELL Inspiron 6400 notebook,
Intel Core2 T7200 @ 2.00GHz, and 2 GB memory.


However the memory used is about 1.4 GB.
How did you manage to get 700 MB memory usage?


Note that the minimum space required is about 480 MB (assuming 4 byte 
integer for the ID, and 1 byte integer for rating).
Using a 4 byte integer for both ID and rating, the space required is 
about 765 MB.


1.5 GB is the space required if one uses a total of 16 bytes to store 
both the ID and the rating.




Maybe it is the garbage collector that does not release memory to the 
operating system?




Thanks,
Manlio Perillo
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re[2]: [Haskell-cafe] memory issues

2009-02-28 Thread Bulat Ziganshin
Hello Daniel,

Saturday, February 28, 2009, 6:20:09 PM, you wrote:
> But they would not be equivalent if stdout has to be locked for each output
> operation separately, but a file opened with openFile fp WriteMode  was
> locked then once and remained so until closed.

GHC Handles are locked for every operation, not for the entire file
lifetime. It's done in order to make concurrent operations from
threads of the same program safe.
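
A sketch of the workaround that follows from this: pay the lock once by
writing the whole report with a single hPutStr (writeSorted is a
made-up helper):

import Data.List (sort)
import System.IO

-- One hPutStr takes the Handle lock once for the whole report
-- instead of once per printed line.
writeSorted :: Show a => FilePath -> [a] -> IO ()
writeSorted fp xs =
    withFile fp WriteMode $ \h -> do
        hSetBuffering h (BlockBuffering Nothing)  -- fewer syscalls, too
        hPutStr h (unlines (map show (sort xs)))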


-- 
Best regards,
 Bulat  mailto:bulat.zigans...@gmail.com

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] memory issues

2009-02-28 Thread Daniel Fischer
Hello Bulat,
On Saturday, 28 February 2009 09:38, Bulat Ziganshin wrote:
> Hello Daniel,
>
> Saturday, February 28, 2009, 3:10:44 AM, you wrote:
> >> print may waste a lot of time, locking stdout for every
> >> line printed
> >
> > hout <- openFile (args!!1) WriteMode
> > mapM_ (hPrint hout) $ sort $ blocks content
> >
> > ? I find hardly any difference, though.
>
> no difference. if handle is locked for every output operation - both
> versions will be equivalent. if not - they also will be equivalent :)

But they would not be equivalent if stdout has to be locked for each output 
operation separately, but a file opened with openFile fp WriteMode  was 
locked then once and remained so until closed.
Since I don't know the internals, I thought it was worth taking a look.
Seems they behave the same.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re[2]: [Haskell-cafe] memory issues

2009-02-28 Thread Bulat Ziganshin
Hello Daniel,

Saturday, February 28, 2009, 3:10:44 AM, you wrote:

>> print may waste a lot of time, locking stdout for every
>> line printed

> hout <- openFile (args!!1) WriteMode
> mapM_ (hPrint hout) $ sort $ blocks content

> ? I find hardly any difference, though.

no difference. if handle is locked for every output operation - both
versions will be equivalent. if not - they also will be equivalent :)


-- 
Best regards,
 Bulat  mailto:bulat.zigans...@gmail.com

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] memory issues

2009-02-27 Thread Daniel Fischer
On Saturday, 28 February 2009 00:37, Bulat Ziganshin wrote:
> Hello Daniel,
>
> Saturday, February 28, 2009, 2:21:31 AM, you wrote:
> >>   printf "%s" $ unlines $ map (show) (sort $! blocks content)
> >
> > Bad!
> > Use
> > mapM_ print $ sort $ blocks content
>
> are you sure?

Tested it. The printf "%s" is very bad. Replacing that reduced allocation and 
GC time by a factor of 2+.
The difference between 

mapM_ print

and

putStr $ unlines $ map show $ ...

is too small for me to be sure that mapM_ print is really better.

> print may waste a lot of time, locking stdout for every
> line printed

Hm, maybe

main = do
    args <- getArgs
    content <- B.readFile (args!!0)
    hout <- openFile (args!!1) WriteMode
    mapM_ (hPrint hout) $ sort $ blocks content
    hClose hout

? I find hardly any difference, though.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re[2]: [Haskell-cafe] memory issues

2009-02-27 Thread Bulat Ziganshin
Hello Don,

Saturday, February 28, 2009, 2:18:37 AM, you wrote:

> offset :: !Integer

oh yes

> And possibly just using {-# UNPACK #-}!Int64 would be ok?

I think that it will be even better, but the main problem is the huge
unevaluated thunks. As a last hope, this may be converted to

x <- getOffsets
y <- getSizes x
z <- sort y

also, "zipWith (Block) offsets sizes" and getSizes may be combined to
make only one pass through data (although it's not highly important
since sort anyway will need them all). really good technique would be
to use some sort of heap to hold only 10-100 largest page sizes
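
A sketch of that idea, with Data.Set over (size, offset) pairs playing
the role of a bounded min-heap (the offset component keeps equal sizes
from colliding in the set):

import Data.List (foldl')
import qualified Data.Set as Set

-- Keep only the k largest (size, offset) pairs, in O(n log k).
kLargest :: Int -> [(Integer, Integer)] -> [(Integer, Integer)]
kLargest k = reverse . Set.toList . foldl' step Set.empty
  where
    step s x
        | Set.size s < k    = Set.insert x s
        | x > Set.findMin s = Set.insert x (Set.deleteMin s)
        | otherwise         = s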

btw,

data Block = Block {
  size::Integer
, offset::Integer
} deriving (Eq, Ord)

lets you omit the instance Ord Block definition (the derived Ord
compares fields left to right, so with size first it orders by size,
breaking ties on offset)


-- 
Best regards,
 Bulat  mailto:bulat.zigans...@gmail.com

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re[2]: [Haskell-cafe] memory issues

2009-02-27 Thread Bulat Ziganshin
Hello Daniel,

Saturday, February 28, 2009, 2:21:31 AM, you wrote:

>>   printf "%s" $ unlines $ map (show) (sort $! blocks content)

> Bad!
> Use
> mapM_ print $ sort $ blocks content

are you sure? print may waste a lot of time, locking stdout for every
line printed

$! is really useless here - it only forces the *head* of the list
(WHNF) before calling sort, which is absolutely useless :)


-- 
Best regards,
 Bulat  mailto:bulat.zigans...@gmail.com

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] memory issues

2009-02-27 Thread Daniel Fischer
On Friday, 27 February 2009 23:18, Rogan Creswick wrote:
>
> \begin{code}
> -- Compiled with:
> -- $ ghc --make offsetSorter.hs
> --
> --  (ghc v. 6.8.2)
> --
> -- Run with:
> -- $ time ./offsetSorter data/byteOffsets.txt > haskOffsets.txt
> -- offsetSorter: out of memory (requested 1048576 bytes)
> --
> -- real   4m12.130s
> -- user   3m4.812s
> -- sys0m5.660s
> --(OOM happened after consuming just over 3000mb of Virt, 2.6gb Res,
> according to top.)
> --
>
> import System (getArgs)
> import Data.Maybe
> import Monad
> import Text.Printf (printf)
> import Data.Function (on)
> import Data.List (sort)
> import Data.ByteString (ByteString)
> import qualified Data.ByteString.Char8 as C8
> import qualified Data.ByteString as B
>
>
> -- get the lines
> -- parse each line to get the offset.
> -- scan the list of offsets
>
> -- | The full file size:
> maxSize :: Integer
> maxSize = 2807444044080
>
> -- | Block is a contiguous chunk of data.
> -- The first entry is the offset, the second is the length.
> data Block = Block {
>   offset::Integer
> , size::Integer
> } deriving (Eq)
>
> -- | Ordering of Blocks is based entirely on the block size.
> instance Ord Block where
>     compare = compare `on` size
>
> instance Show Block where
>     show (Block o s) = (show o) ++ "  " ++ (show s)
>
> -- turn the file into a list of offsets:
> getOffsets :: ByteString -> [Integer]
> getOffsets = catMaybes . map parseOffset . C8.lines
>
> -- | Pull out the offsets from a line of the file.
> parseOffset :: ByteString -> Maybe Integer
> parseOffset s = do
>   (i, _) <- C8.readInteger (C8.filter (/=':') s)

Why the C8.filter (/= ':')?
That just costs and doesn't help anything (in fact, if your file contains 
lines like 1234:5678, it gives wrong results).

If you know that your file contains only lines of the form offset: ,
you can have

getOffsets = map (fst . fromJust . C8.readInteger) . C8.lines

which seems to do a little good.

>   Just i
>
> -- | Get the offsets between entries in a list
> getSizes :: [Integer]  -> [Integer]
> getSizes (x:y:[]) = [y - x]
> getSizes (x:y:ys) = (y - x):(getSizes (y:ys))
>
> -- | creates and returns a list of Blocks, given a file's content.
> blocks :: ByteString -> [Block]
> blocks s = zipWith (Block) offsets sizes
>where offsets = getOffsets s
>          sizes   = getSizes (offsets ++ [maxSize])
>
> main :: IO ()
> main = do
>   args <- getArgs
>   content <- B.readFile (args!!0)


>   printf "%s" $ unlines $ map (show) (sort $! blocks content)

Bad!
Use
mapM_ print $ sort $ blocks content

> \end{code}

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] memory issues

2009-02-27 Thread Don Stewart
bulat.ziganshin:
> Hello Rogan,
> 
> Saturday, February 28, 2009, 1:18:47 AM, you wrote:
> 
> > data Block = Block {
> >   offset::Integer
> > , size::Integer
> > } deriving (Eq)
> 
> try
>    !offset::Integer
>  , !size::Integer
> 

offset :: !Integer

And possibly just using {-# UNPACK #-}!Int64 would be ok?

-- Don
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] memory issues

2009-02-27 Thread Bulat Ziganshin
Hello Rogan,

Saturday, February 28, 2009, 1:18:47 AM, you wrote:

> data Block = Block {
>   offset::Integer
> , size::Integer
> } deriving (Eq)

try
   !offset::Integer
 , !size::Integer


-- 
Best regards,
 Bulat  mailto:bulat.zigans...@gmail.com

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] memory issues

2009-02-27 Thread Don Stewart
creswick:
> On Fri, Feb 27, 2009 at 2:20 PM, Don Stewart  wrote:
> > creswick:
> >> \begin{code}
> >> -- Compiled with:
> >> -- $ ghc --make offsetSorter.hs
> >
> > YIKES!! Use the optimizer!
> >
> >    ghc -O2 --make
> 
> Ah, that did drastically cut the amount of time it takes to run out of
> memory (down to 1:23), but unfortunately I can't see any other
> improvements -- the memory consumed seems to be about the same.
> (granted, I have no indication of progress -- it may be getting
> significantly more done, but it's not quite over the hump and
> producing output yet.)
> 

Ok. Now, profile! (ghc -O2 -prof -auto-all --make)

http://book.realworldhaskell.org/read/profiling-and-optimization.html
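
A typical run then looks like this (file names as earlier in the
thread; -p writes a time/allocation profile to offsetSorter.prof,
-hc a heap profile by cost centre):

$ ghc -O2 -prof -auto-all --make offsetSorter.hs
$ ./offsetSorter data/byteOffsets.txt +RTS -p -hc -RTS > haskOffsets.txt
$ hp2ps -c offsetSorter.hp   # colour PostScript graph of the heap profile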
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] memory issues

2009-02-27 Thread Rogan Creswick
On Fri, Feb 27, 2009 at 2:20 PM, Don Stewart  wrote:
> creswick:
>> \begin{code}
>> -- Compiled with:
>> -- $ ghc --make offsetSorter.hs
>
> YIKES!! Use the optimizer!
>
>    ghc -O2 --make

Ah, that did drastically cut the amount of time it takes to run out of
memory (down to 1:23), but unfortunately I can't see any other
improvements -- the memory consumed seems to be about the same.
(granted, I have no indication of progress -- it may be getting
significantly more done, but it's not quite over the hump and
producing output yet.)

--Rogan
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] memory issues

2009-02-27 Thread Don Stewart
creswick:
> First off, my apologies for breaking etiquette, if/when I do -- I've
> only just joined Haskell-cafe, and I'm quite new to Haskell.
> 
> I have recently been trying to process a large data set (the 2.8tb
> wikipedia data dump), and also replace my scripting needs with haskell
> (needs that have previously been filled with bash, perl, and bits of
> Java).  Last week I needed to do some quick scanning of the (7zipped)
> wikipedia dump to get a feel for the size of articles, and from that
> determine the best way to process the whole enchilada... cutting to
> the chase, I ended up with a file consisting of byte offsets and lines
> matched by a grep pattern (a 250mb file).  Specifically, 11m lines of:
> 
> 1405:  
> 14062:  
> 15979:  
> 18665:  
> 920680797:  
> ..
> 2807444041476:  
> 2807444043623:  
> 
> I needed to know how large the largest  elements were, so I'd
> know if they would fit in memory, and some idea of how many would
> cause swapping, etc. So, I wrote a simple app in haskell (below) to
> find the sizes of each  and sort them.  Unfortunately, it
> consumes an absurd amount of memory (3+gb) and dies with an
> out-of-memory error.  Given the input size, and what it is doing, this
> seems ridiculously high -- can anyone help me understand what is going
> on, and how I can prevent this sort of rampant memory use?
> 
> I can provide a link to the input file if anyone wants it, but it
> doesn't seem particularly useful, given the simplicity and size.
> Since I needed to get results fairly quickly, I've re-implemented this
> in java, so that reference implementation is also available should
> anyone want it (the approach that is most similar to the haskell
> requires a 1.4gb heap, but by streaming the string->long parsing, that
> requirement drops to ~600mb, which seems pretty reasonable, since the
> *output* is 215mb.)
> 
> Thanks!
> Rogan
> 
> \begin{code}
> -- Compiled with:
> -- $ ghc --make offsetSorter.hs


YIKES!! Use the optimizer!

ghc -O2 --make


-- Don
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] memory issues

2009-02-27 Thread Rogan Creswick
First off, my apologies for breaking etiquette, if/when I do -- I've
only just joined Haskell-cafe, and I'm quite new to Haskell.

I have recently been trying to process a large data set (the 2.8tb
wikipedia data dump), and also replace my scripting needs with haskell
(needs that have previously been filled with bash, perl, and bits of
Java).  Last week I needed to do some quick scanning of the (7zipped)
wikipedia dump to get a feel for the size of articles, and from that
determine the best way to process the whole enchilada... cutting to
the chase, I ended up with a file consisting of byte offsets and lines
matched by a grep pattern (a 250mb file).  Specifically, 11m lines of:

1405:  
14062:  
15979:  
18665:  
920680797:  
..
2807444041476:  
2807444043623:  

I needed to know how large the largest  elements were, so I'd
know if they would fit in memory, and some idea of how many would
cause swapping, etc. So, I wrote a simple app in haskell (below) to
find the sizes of each  and sort them.  Unfortunately, it
consumes an absurd amount of memory (3+gb) and dies with an
out-of-memory error.  Given the input size, and what it is doing, this
seems ridiculously high -- can anyone help me understand what is going
on, and how I can prevent this sort of rampant memory use?

I can provide a link to the input file if anyone wants it, but it
doesn't seem particularly useful, given the simplicity and size.
Since I needed to get results fairly quickly, I've re-implemented this
in java, so that reference implementation is also available should
anyone want it (the approach that is most similar to the haskell
requires a 1.4gb heap, but by streaming the string->long parsing, that
requirement drops to ~600mb, which seems pretty reasonable, since the
*output* is 215mb.)

Thanks!
Rogan

\begin{code}
-- Compiled with:
-- $ ghc --make offsetSorter.hs
--
--  (ghc v. 6.8.2)
--
-- Run with:
-- $ time ./offsetSorter data/byteOffsets.txt > haskOffsets.txt
-- offsetSorter: out of memory (requested 1048576 bytes)
--
-- real 4m12.130s
-- user 3m4.812s
-- sys  0m5.660s
--(OOM happened after consuming just over 3000mb of Virt, 2.6gb Res,
according to top.)
--

import System (getArgs)
import Data.Maybe
import Monad
import Text.Printf (printf)
import Data.Function (on)
import Data.List (sort)
import Data.ByteString (ByteString)
import qualified Data.ByteString.Char8 as C8
import qualified Data.ByteString as B


-- get the lines
-- parse each line to get the offset.
-- scan the list of offsets

-- | The full file size:
maxSize :: Integer
maxSize = 2807444044080

-- | Block is a contiguous chunk of data.
-- The first entry is the offset, the second is the length.
data Block = Block {
  offset::Integer
, size::Integer
} deriving (Eq)

-- | Ordering of Blocks is based entirely on the block size.
instance Ord Block where
    compare = compare `on` size

instance Show Block where
    show (Block o s) = (show o) ++ "  " ++ (show s)

-- turn the file into a list of offsets:
getOffsets :: ByteString -> [Integer]
getOffsets = catMaybes . map parseOffset . C8.lines

-- | Pull out the offsets from a line of the file.
parseOffset :: ByteString -> Maybe Integer
parseOffset s = do
  (i, _) <- C8.readInteger (C8.filter (/=':') s)
  Just i

-- | Get the offsets between entries in a list
getSizes :: [Integer]  -> [Integer]
getSizes (x:y:[]) = [y - x]
getSizes (x:y:ys) = (y - x):(getSizes (y:ys))
getSizes _        = []  -- fewer than two offsets: no gaps to measure

-- | creates and returns a list of Blocks, given a file's content.
blocks :: ByteString -> [Block]
blocks s = zipWith (Block) offsets sizes
   where offsets = getOffsets s
         sizes   = getSizes (offsets ++ [maxSize])

main :: IO ()
main = do
  args <- getArgs
  content <- B.readFile (args!!0)
  printf "%s" $ unlines $ map (show) (sort $! blocks content)
\end{code}
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] memory-efficient data type for Netflix data - UArray Int Int vs UArray Int Word8

2009-02-26 Thread Manlio Perillo

Kenneth Hoste wrote:

[...]
However, as I posted yesterday, I've been able to circumvent the issue 
by rethinking my data type, i.e. using
the ~18K movie IDs as key instead of the 480K user IDs, which radically 
limits the overhead...


Well, but what if you really need the original data structure, for 
better data processing?


That way, I'm able to fit the data set in <700M of memory, without 
having to reorganize the raw data.


The uvector package implements a vector of unboxed types, and has a
snocU operation to append an element to the array.


I don't know how efficient it is, however.



By the way, about uvector: it has a Stream data type, and you can 
build a vector from a stream.


Thanks for letting me know, I'll keep this in mind.



Let me know if there are performance improvements.

Arrays are one of the few things I dislike in Haskell, and all the 
available array/vector packages cause me some confusion.





Regards,
Manlio
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] memory-efficient data type for Netflix data - UArray Int Int vs UArray Int Word8

2009-02-26 Thread Kenneth Hoste


On Feb 26, 2009, at 13:00 , Manlio Perillo wrote:


Kenneth Hoste wrote:

Hello,
I'm having a go at the Netflix Prize using Haskell. Yes, I'm brave.
[...]
To see if I could efficiently represent the data set in this way, I wrote
a small Haskell program (attached) which uses the following data type:


From what I see, to append a new integer to the Array, you convert  
the array to a list, append the new element to the list, and then  
convert to array again.


Isn't this a bit inefficient?


Yes, performance-wise this is terribly inefficient, I agree. But, it  
was just an artefact of how the raw data is organized.


My main concern was the memory usage of the huge IntMap with UArray  
elements.
Once I solved that, I would be able to get around the performance  
issue by reorganizing the raw data.


However, as I posted yesterday, I've been able to circumvent the issue  
by rethinking my data type, i.e. using
the ~18K movie IDs as key instead of the 480K user IDs, which  
radically limits the overhead...
That way, I'm able to fit the data set in <700M of memory, without  
having to reorganize the raw data.


The uvector package implements a vector of unboxed types, and has a
snocU operation to append an element to the array.


I don't know how efficient it is, however.



By the way, about uvector: it has a Stream data type, and you can  
build a vector from a stream.


Thanks for letting me know, I'll keep this in mind.

greetings,

Kenneth

--

Kenneth Hoste
Paris research group - ELIS - Ghent University, Belgium
email: kenneth.ho...@elis.ugent.be
website: http://www.elis.ugent.be/~kehoste
blog: http://boegel.kejo.be

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] memory-efficient data type for Netflix data - UArray Int Int vs UArray Int Word8

2009-02-26 Thread Manlio Perillo

Kenneth Hoste wrote:

Hello,

I'm having a go at the Netflix Prize using Haskell. Yes, I'm brave.

[...]
To see if I could efficiently represent the data set in this way, I wrote
a small Haskell program (attached) which uses the following data type:



From what I see, to append a new integer to the Array, you convert the 
array to a list, append the new element to the list, and then convert to 
array again.


Isn't this a bit inefficient?

The uvector package implements a vector of unboxed types, and has a
snocU operation to append an element to the array.


I don't know how efficient it is, however.


By the way, about uvector: it has a Stream data type, and you can build 
a vector from a stream.


But how this work and how (if any) the stream data is integrated with 
other packages?

The package documentations seems to be still incomplete.

> [...]



Regards,
Manlio Perillo
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] memory-efficient data type for Netflix data - UArray Int Int vs UArray Int Word8

2009-02-23 Thread wren ng thornton

Kenneth Hoste wrote:
Well, I'm using UArray, but I'm willing to consider other suitable 
containers...

As long as they are memory efficient. :-)

The typical usage of a UArray will be getting all its contents,
and converting it to a list to easily manipulate (filter, ...).

So, maybe another data type allows me to store the data in a limited 
amount of memory

(which is my main concern now)...


Have you considered using spectral (or counting) bloom filters? I know 
there's a non-spectral version available[1], though I haven't looked at 
it to see how easily it could be made to count.



[1] http://hackage.haskell.org/cgi-bin/hackage-scripts/package/bloomfilter

--
Live well,
~wren
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] memory-efficient data type for Netflix data - UArray Int Int vs UArray Int Word8

2009-02-23 Thread Kenneth Hoste


On Feb 23, 2009, at 19:57 , Don Stewart wrote:


bos:

2009/2/23 Kenneth Hoste 


   Does anyone know why the Word8 version is not significantly  
better in terms

   of memory usage?


Yes, because there's a typo on line 413 of Data/Array/Vector/Prim/BUArr.hs.


How's that for service? :-)


UArray or UArr?


Well, I'm using UArray, but I'm willing to consider other suitable  
containers...

As long as they are memory efficient. :-)

The typical usage of a UArray will be getting all its contents,
and converting it to a list to easily manipulate (filter, ...).
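
A sketch of that access pattern (hypothetical names; one user's
ratings as a byte-packed array, as in the attached code):

import Data.Word (Word8)
import Data.Array.Unboxed (UArray, elems)

-- All ratings of 4 or higher for one user's rating array.
highRatings :: UArray Int Word8 -> [Word8]
highRatings = filter (>= 4) . elems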

So, maybe another data type allows me to store the data in a limited  
amount of memory

(which is my main concern now)...

K.

--

Kenneth Hoste
ELIS - Ghent University
email: kenneth.ho...@elis.ugent.be
blog: http://www.elis.ugent.be/~kehoste/blog
website: http://www.elis.ugent.be/~kehoste

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] memory-efficient data type for Netflix data - UArray Int Int vs UArray Int Word8

2009-02-23 Thread Don Stewart
bos:
> 2009/2/23 Kenneth Hoste 
>  
> 
> Does anyone know why the Word8 version is not significantly better in 
> terms
> of memory usage?
> 
> 
> Yes, because there's a typo on line 413 of Data/Array/Vector/Prim/BUArr.hs.
> 
> How's that for service? :-)

UArray or UArr?
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] memory-efficient data type for Netflix data - UArray Int Int vs UArray Int Word8

2009-02-23 Thread Bryan O'Sullivan
2009/2/23 Kenneth Hoste 


> Does anyone know why the Word8 version is not significantly better in terms
> of memory usage?
>

Yes, because there's a typo on line 413 of Data/Array/Vector/Prim/BUArr.hs.

How's that for service? :-)
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] memory-efficient data type for Netflix data - UArray Int Int vs UArray Int Word8

2009-02-23 Thread Kenneth Hoste

Hello,

I'm having a go at the Netflix Prize using Haskell. Yes, I'm brave.

I kind of have an algorithm in mind that I want to implement using
Haskell, but up until now, the main issue has been to find a way to
efficiently represent the data...

For people who are not familiar with the Netflix data, in short, it
consists of roughly 100M (1e8) user ratings (1-5, integer) for 17,770
different movies, coming from 480,109 different users.

The idea I have in mind requires fast access to all the ratings for a
particular user, so I was thinking about an IntMap in terms of user ids,
which maps to movie id/rating pairs somehow.

To see if I could efficiently represent the data set in this way, I wrote
a small Haskell program (attached) which uses the following data type:


[attachment: testMemSizeUArray_UArray_Word8.hs]




type Ratings = IntMap (UArray Int Word8) -- map of user IDs to ratings (lacks movie IDs)


For convenience, and because I've been discussing this with various
people in #haskell @ IRC, the code is also available here:
http://hpaste.org/fastcgi/hpaste.fcgi/view?id=1671#a1671 .


I'm aware that the way in which the UArrays are constructed (i.e. by
adding a single element each time) is far from efficient
performance-wise, but that's not the point here...
By reformatting the raw data, I can easily read in the data more
efficiently (see the sketch below).
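
A sketch of that one-pass construction (assuming the (userId, rating)
pairs are already parsed; listArray builds each byte-packed array in
one go instead of appending element by element):

import Data.Word (Word8)
import qualified Data.IntMap as IM
import Data.Array.Unboxed (UArray, listArray)

fromPairs :: [(Int, Word8)] -> IM.IntMap (UArray Int Word8)
fromPairs ps = IM.map pack grouped
  where
    grouped = IM.fromListWith (++) [ (u, [r]) | (u, r) <- ps ]
    pack rs = listArray (0, length rs - 1) rs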


The issue I ran into is that changing the data type to the following
doesn't yield any significant difference in memory usage.

type Ratings = IntMap (UArray Int Int) -- use Int instead of Word8 for a user rating


Someone (dolio) @ #haskell suggested that maybe UArray is not
byte-packed for Word8, which would cause little difference from a
UArray containing Ints, but someone else (dons @ #ghc) was able to
tell me it _is_ byte-packed.

Does anyone know why the Word8 version is not significantly better in  
terms of memory usage?


greetings,

Kenneth

PS: My adventures on trying to tackle the Netflix Prize problem with  
Haskell can be followed at http://boegel.kejo.be.


--

Kenneth Hoste
ELIS - Ghent University
email: kenneth.ho...@elis.ugent.be
blog: http://www.elis.ugent.be/~kehoste/blog
website: http://www.elis.ugent.be/~kehoste

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Memory

2009-02-18 Thread Jeff Douglas
I'd love to. I'm thinking of starting a blog once I get more
experience and familiarity with the language.

Jeff

On Wed, Feb 18, 2009 at 6:13 PM, Magnus Therning  wrote:
> On Tue, Feb 17, 2009 at 5:53 AM, Jeff Douglas  wrote:
>> Thanks Guys,
>>
>> Not only did I not run optimizations, I misread the profile. It looks
>> like it was an imaginary problem from the beginning. I guess I should
>> go through all the profiling documentation more carefully.
>
> Please share what you find in some way (mail/blog/anything).  It'll
> help me avoid making the same mistakes once I get around to writing
> something in Haskell where I have to care about resources :-)
>
> /M
>
> --
> Magnus Therning(OpenPGP: 0xAB4DFBA4)
> magnus@therning.org  Jabber: magnus@therning.org
> http://therning.org/magnus identi.ca|twitter: magthe
>
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Memory

2009-02-18 Thread Magnus Therning
On Tue, Feb 17, 2009 at 5:53 AM, Jeff Douglas  wrote:
> Thanks Guys,
>
> Not only did I not run optimizations, I misread the profile. It looks
> like it was an imaginary problem from the beginning. I guess I should
> go through all the profiling documentation more carefully.

Please share what you find in some way (mail/blog/anything).  It'll
help me avoid making the same mistakes once I get around to writing
something in Haskell where I have to care about resources :-)

/M

-- 
Magnus Therning(OpenPGP: 0xAB4DFBA4)
magnus@therning.org  Jabber: magnus@therning.org
http://therning.org/magnus identi.ca|twitter: magthe
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Memory

2009-02-16 Thread Jeff Douglas
Thanks Guys,

Not only did I not run optimizations, I misread the profile. It looks
like it was an imaginary problem from the beginning. I guess I should
go through all the profiling documentation more carefully.

Jeff

On Tue, Feb 17, 2009 at 2:46 PM, Bernie Pope  wrote:
>
> On 17/02/2009, at 3:56 PM, Jeff Douglas wrote:
>
>> Hello All,
>>
>> The kind people at #haskell suggested I come to haskell-cafe for
>> questions about haskell performance issues.
>> I'm new to haskell, and I'm having a hard time understanding how to
>> deal with memory leaks.
>>
>> I've been playing with some network server examples and I noticed with
>> each new connection, the memory footprint increases by about 7k
>> However, the leaks don't seem to have anything to do with the
>> networking code. Actually I get a huge leak just from using using
>> 'forever'.
>>
>>> import Control.Monad
>>> import System.IO
>>>
>>> main = forever $ putStrLn "hi"
>>
>> When I run it for a few seconds with profiling...
>>
>>> total time  =        0.36 secs   (18 ticks @ 20 ms)
>>> total alloc =  54,423,396 bytes  (excludes profiling overheads)
>>
>> Can this be right?
>
>
> I don't think there should be a space leak in the code you posted.
>
> On my mac, OS X 10.5.6, GHC version 6.8.3, it appears to run in constant
> space with or without optimisation.
>
> GHCi seems to gobble a little bit of memory (but that could be incidental).
>
> My terminal application does gobble memory for a while (and then frees it),
> but that is presumably because it is hammering the buffer (and it nearly
> sets my lap on fire when running).
>
> Perhaps you could post more details about how it is compiled, and what
> versions of things are being used.
>
> How are you detecting the leak (via top?).
>
> Cheers,
> Bernie.
>
>
>
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Memory

2009-02-16 Thread Bernie Pope


On 17/02/2009, at 3:56 PM, Jeff Douglas wrote:


Hello All,

The kind people at #haskell suggested I come to haskell-cafe for
questions about haskell performance issues.
I'm new to haskell, and I'm having a hard time understanding how to
deal with memory leaks.

I've been playing with some network server examples and I noticed with
each new connection, the memory footprint increases by about 7k
However, the leaks don't seem to have anything to do with the
networking code. Actually I get a huge leak just from using using
'forever'.


import Control.Monad
import System.IO

main = forever $ putStrLn "hi"


When I run it for a few seconds with profiling...


total time  =        0.36 secs   (18 ticks @ 20 ms)
total alloc =  54,423,396 bytes  (excludes profiling overheads)


Can this be right?



I don't think there should be a space leak in the code you posted.

On my mac, OS X 10.5.6, GHC version 6.8.3, it appears to run in  
constant space with or without optimisation.


GHCi seems to gobble a little bit of memory (but that could be  
incidental).


My terminal application does gobble memory for a while (and then frees  
it), but that is presumably because it is hammering the buffer (and it  
nearly sets my lap on fire when running).


Perhaps you could post more details about how it is compiled, and what  
versions of things are being used.


How are you detecting the leak (via top?).

Cheers,
Bernie.


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Memory

2009-02-16 Thread Don Stewart
inbuninbu:
> Hello All,
> 
> The kind people at #haskell suggested I come to haskell-cafe for
> questions about haskell performance issues.
> I'm new to haskell, and I'm having a hard time understanding how to
> deal with memory leaks.
> 
> I've been playing with some network server examples and I noticed with
> each new connection, the memory footprint increases by about 7k
> However, the leaks don't seem to have anything to do with the
> networking code. Actually I get a huge leak just from using using
> 'forever'.
> 
> > import Control.Monad
> > import System.IO
> >
> > main = forever $ putStrLn "hi"
> 
> When I run it for a few seconds with profiling...
> 
> > total time  =        0.36 secs   (18 ticks @ 20 ms)
> > total alloc =  54,423,396 bytes  (excludes profiling overheads)
> 
> Can this be right?


did you compile with optimisations on?

$ ghc -O2 A.hs --make
$ time ./A +RTS -sstderr

          17,880 bytes maximum residency (1 sample(s))
          18,984 bytes maximum slop
               1 MB total memory in use (0 MB lost due to fragmentation)

  Generation 0:  9951 collections,     0 parallel,  0.07s,  0.07s elapsed
  Generation 1:     1 collections,     0 parallel,  0.00s,  0.00s elapsed

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time    4.45s  ( 16.08s elapsed)
  GC    time    0.07s  (  0.07s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time    4.52s  ( 16.16s elapsed)

  %GC time       1.5%  (0.5% elapsed)

  Alloc rate    1,173,414,505 bytes per MUT second

  Productivity  98.5% of total user, 27.5% of total elapsed

./A +RTS -sstderr  4.52s user 10.61s system 93% cpu 16.161 total

So it's allocating small cells, frequently, then discarding them -- and running 
in constant space.

Looking further,

forever :: (Monad m) => m a -> m b
forever a   = a >> forever a

Well, here, the result is thrown away anyway. And the result is (), so I'd 
expect 
constant space. 

Looks good to me. Did you run it without optimisations, on , perhaps?


-- Don
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Memory

2009-02-16 Thread Jeff Douglas
Hello All,

The kind people at #haskell suggested I come to haskell-cafe for
questions about haskell performance issues.
I'm new to haskell, and I'm having a hard time understanding how to
deal with memory leaks.

I've been playing with some network server examples and I noticed with
each new connection, the memory footprint increases by about 7k
However, the leaks don't seem to have anything to do with the
networking code. Actually I get a huge leak just from using using
'forever'.

> import Control.Monad
> import System.IO
>
> main = forever $ putStrLn "hi"

When I run it for a few seconds with profiling...

> total time  =        0.36 secs   (18 ticks @ 20 ms)
> total alloc =  54,423,396 bytes  (excludes profiling overheads)

Can this be right?
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


RE: [Haskell-cafe] Memory efficiency questions for real-time graphics

2008-11-05 Thread Tobias Bexelius
I believe Svein is thinking of render-to-texture, which indeed could be
used to emulate Stream-Out functionality (but with reduced performance,
of course). This is also hardware-specific and was still not supported
on the first GPUs.

 


From: Sebastian Sylvan [mailto:[EMAIL PROTECTED] 
Sent: 4 November 2008 20:06
To: [EMAIL PROTECTED]
Cc: Tobias Bexelius; haskell-cafe@haskell.org
Subject: Re: [Haskell-cafe] Memory efficiency questions for real-time
graphics




On Mon, Nov 3, 2008 at 3:45 PM, Svein Ove Aas <[EMAIL PROTECTED]> wrote:


On Mon, Nov 3, 2008 at 11:31 AM, Tobias Bexelius
<[EMAIL PROTECTED]> wrote:
> Before Direct3D 10, its too costly to read back the updated
vertex data
> in every frame, which force you to make this kind of
operations on the
> CPU.
> With D3D 10 however, you should use the new Stream-Output
stage which is
> used to return updated vertex data directly to a vertex buffer
on the
> GPU. So if you can afford a new graphics card and likes Vista,
that's
> the way to go :)
>

Or you could use OpenGL, which has supported that since the
first GPUs
that did were released.


I think that came with OpenGL 3.0. Unless you're counting
vendor-specific extensions... 

-- 
Sebastian Sylvan
+44(0)7857-300802
UIN: 44640862

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Memory efficiency questions for real-time graphics

2008-11-04 Thread Sebastian Sylvan
On Mon, Nov 3, 2008 at 3:45 PM, Svein Ove Aas <[EMAIL PROTECTED]> wrote:

> On Mon, Nov 3, 2008 at 11:31 AM, Tobias Bexelius
> <[EMAIL PROTECTED]> wrote:
> > Before Direct3D 10, its too costly to read back the updated vertex data
> > in every frame, which force you to make this kind of operations on the
> > CPU.
> > With D3D 10 however, you should use the new Stream-Output stage which is
> > used to return updated vertex data directly to a vertex buffer on the
> > GPU. So if you can afford a new graphics card and likes Vista, that's
> > the way to go :)
> >
> Or you could use OpenGL, which has supported that since the first GPUs
> that did were released.


I think that came with OpenGL 3.0. Unless you're counting vendor-specific
extensions...
-- 
Sebastian Sylvan
+44(0)7857-300802
UIN: 44640862
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Memory efficiency questions for real-time graphics

2008-11-03 Thread Svein Ove Aas
On Mon, Nov 3, 2008 at 11:31 AM, Tobias Bexelius
<[EMAIL PROTECTED]> wrote:
> Before Direct3D 10, its too costly to read back the updated vertex data
> in every frame, which force you to make this kind of operations on the
> CPU.
> With D3D 10 however, you should use the new Stream-Output stage which is
> used to return updated vertex data directly to a vertex buffer on the
> GPU. So if you can afford a new graphics card and likes Vista, that's
> the way to go :)
>
Or you could use OpenGL, which has supported that since the first GPUs
that did were released.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


RE: [Haskell-cafe] Memory efficiency questions for real-time graphics

2008-11-03 Thread Tobias Bexelius
Before Direct3D 10, it's too costly to read back the updated vertex data
in every frame, which forces you to perform this kind of operation on the
CPU.
With D3D 10 however, you should use the new Stream-Output stage, which is
used to return updated vertex data directly to a vertex buffer on the
GPU. So if you can afford a new graphics card and like Vista, that's
the way to go :)

/Tobias



-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of T Willingham
Sent: 2 November 2008 20:11
To: haskell-cafe@haskell.org
Subject: Re: [Haskell-cafe] Memory efficiency questions for real-time
graphics

On Sat, Nov 1, 2008 at 2:11 PM, Sebastian Sylvan
<[EMAIL PROTECTED]> wrote:
> On Sat, Nov 1, 2008 at 6:57 PM, T Willingham 
> <[EMAIL PROTECTED]>
>>
>> The per-vertex computation is a quite complex time-dependent function
>> applied to the given domain on each update.  Yet even if it were 
>> simple, I would still first implement the formulas in Haskell and 
>> leave the optimization for later, if at all.
>
> I'd be very surprised to see a vertex transform that would be faster 
> to implement on the CPU than the GPU.

It appears you misunderstood "complex time-dependent function".  This is
quite a bit of Haskell code involving a significant amount of octonion
algebra.  It could, in principle, be put on the GPU, however that should
be the very last step after everything else works.

> There are various ways of writing out your vertex data too, so if it 
> doesn't change too often you can still do the transformation on the 
> GPU,

As I previously stated, it changes every frame.

Take a highly complicated function and apply it to N vertices.  Now
increase N until the framerate is affected.  That is where I am.  It is
obvious that any N-sized allocations will cause the framerate to
stutter.  This is not just theoretical: I *see* it as I increase N while
it's running.  This is the point of my initial email, the subject of
which is memory efficiency.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Memory efficiency questions for real-time graphics

2008-11-02 Thread T Willingham
On Sun, Nov 2, 2008 at 2:13 PM, Don Stewart <[EMAIL PROTECTED]> wrote:
> t.r.willingham:
>> Take a highly complicated function and apply it to N vertices.  Now
>> increase N until the framerate is affected.  That is where I am.  It
>> is obvious that any N-sized allocations will cause the framerate to
>> stutter.  This is not just theoretical: I *see* it as I increase N
>> while it's running.  This is the point of my initial email, the
>> subject of which is memory efficiency.
>
> I'm glad things are going well. Do you have the code somewhere for
> review?

If you read my initial post, you'll see that I am looking to convert
an existing C++ application to Haskell.  Above I am describing the
current C++ implementation.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Memory efficiency questions for real-time graphics

2008-11-02 Thread Don Stewart
t.r.willingham:
> On Sat, Nov 1, 2008 at 2:11 PM, Sebastian Sylvan
> <[EMAIL PROTECTED]> wrote:
> > On Sat, Nov 1, 2008 at 6:57 PM, T Willingham <[EMAIL PROTECTED]>
> >>
> >> The per-vertex computation is a quite complex time-dependent function
> >> applied to the given domain on each update.  Yet even if it were
> >> simple, I would still first implement the formulas in Haskell and
> >> leave the optimization for later, if at all.
> >
> > I'd be very surprised to see a vertex transform that would be faster to
> > implement on the CPU than the GPU.
> 
> It appears you misunderstood "complex time-dependent function".  This
> is quite a bit of Haskell code involving a significant amount of
> octonion algebra.  It could, in principle, be put on the GPU, however
> that should be the very last step after everything else works.
> 
> > There are various ways of writing out your vertex data too, so if it doesn't
> > change too often you can still do the transformation on the GPU,
> 
> As I previously stated, it changes every frame.
> 
> Take a highly complicated function and apply it to N vertices.  Now
> increase N until the framerate is affected.  That is where I am.  It
> is obvious that any N-sized allocations will cause the framerate to
> stutter.  This is not just theoretical: I *see* it as I increase N
> while it's running.  This is the point of my initial email, the
> subject of which is memory efficiency.

I'm glad things are going well. Do you have the code somewhere for
review?


-- Don
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Memory efficiency questions for real-time graphics

2008-11-02 Thread T Willingham
On Sat, Nov 1, 2008 at 2:11 PM, Sebastian Sylvan
<[EMAIL PROTECTED]> wrote:
> On Sat, Nov 1, 2008 at 6:57 PM, T Willingham <[EMAIL PROTECTED]>
>>
>> The per-vertex computation is a quite complex time-dependent function
>> applied to the given domain on each update.  Yet even if it were
>> simple, I would still first implement the formulas in Haskell and
>> leave the optimization for later, if at all.
>
> I'd be very surprised to see a vertex transform that would be faster to
> implement on the CPU than the GPU.

It appears you misunderstood "complex time-dependent function".  This
is quite a bit of Haskell code involving a significant amount of
octonion algebra.  It could, in principle, be put on the GPU, however
that should be the very last step after everything else works.

> There are various ways of writing out your vertex data too, so if it doesn't
> change too often you can still do the transformation on the GPU,

As I previously stated, it changes every frame.

Take a highly complicated function and apply it to N vertices.  Now
increase N until the framerate is affected.  That is where I am.  It
is obvious that any N-sized allocations will cause the framerate to
stutter.  This is not just theoretical: I *see* it as I increase N
while it's running.  This is the point of my initial email, the
subject of which is memory efficiency.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Memory efficiency questions for real-time graphics

2008-11-01 Thread T Willingham
On Tue, Oct 28, 2008 at 3:24 PM, Sebastian Sylvan
<[EMAIL PROTECTED]> wrote:
> 2008/10/28 T Willingham <[EMAIL PROTECTED]>
>>
>> To give a context for all of this, I am applying a non-linear
>> transformation to an object on every frame.  (Note: non-linear, so a
>> matrix transform will not suffice.)
>
> Any reason why you can not do this in the vertex shader? You really should
> avoid trying to touch the vertices with the CPU if at all possible.

The per-vertex computation is a quite complex time-dependent function
applied to the given domain on each update.  Yet even if it were
simple, I would still first implement the formulas in Haskell and
leave the optimization for later, if at all.  The current C++
implementation which uses userland-memory vertex arrays already
performs very well.

On Sat, Nov 1, 2008 at 3:15 AM, Neal Alexander <[EMAIL PROTECTED]> wrote:
> Even when generating one or more copies of "world" per frame the performance
> stays fine and allocations are minimal.

Who says?  That may be your particular experience from your particular
tests.  In my case, any copy of the "world" on each frame would have a
catastrophic effect on the framerate, for any such definition of
"world".

> From what I've seen, the OpenGL calls are what's going to be the bottleneck.

Yes, that may as well be a tautology.  The problem is sporadic lag and
jittering from occasional large allocations and/or garbage collection
from frequent small allocations.

It's a unique situation where even the best profiling cannot pinpoint
what is blatantly obvious to the human eye.  Though the profiler may
register it as 0.01%, the actual effect is glitchy behavior which
comes off as unprofessional.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Memory efficiency questions for real-time graphics

2008-10-28 Thread Sebastian Sylvan
2008/10/28 T Willingham <[EMAIL PROTECTED]>

>
> As my first Haskell exposure, I've been working through Real World
> Haskell.
>
> I am considering converting some of my C++ graphics libraries to
> Haskell.  I've done a fair amount of googling on the subject, however
> I haven't quite been able to find clear answers to some of following
> issues.
>
> (1) Using an OpenGL vertex array (a contiguous chunk of memory which
> is handed to the graphics card) is important.  I see the source code
> of Frag does this, so it looks like we're good.  Check.
>
> (2) In-place modification of the vertex array is important.  Every
> vertex changes on each frame update.  And we always want more
> vertices.
>
> (3) Growing and shrinking the vertex array efficiently is important.
> Here the behavior of C++ std::vector happens to be ideal.  When
> growing, extend the array in-place if possible (using reserved space
> if any), otherwise allocate a new chunk ("amortized constant time").
> When shrinking, do nothing but record the smaller size; the unused
> memory chunk is reserved for possible future growth.
>
> (4) Doing little to no allocations each frame update is important.  In
> the current C++ version, zero allocations occur during a normal frame
> update.  When the vertex array changes, that is the only allocation
> which happens.
>
> To give a context for all of this, I am applying a non-linear
> transformation to an object on every frame.  (Note: non-linear, so a
> matrix transform will not suffice.)
>

Any reason why you can not do this in the vertex shader? You really should
avoid trying to touch the vertices with the CPU if at all possible.

-- 
Sebastian Sylvan
+44(0)7857-300802
UIN: 44640862
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Memory efficiency questions for real-time graphics

2008-10-27 Thread Jefferson Heard
By the way, T, feel free to lean on me if you run into any problems.
I did something along the lines of what you were describing some time
ago, my particular non-linear transform being the conversion of a vertex
array to/from polar coordinates, updating in real time.

-- Jeff

On Tue, Oct 28, 2008 at 12:00 AM, T Willingham <[EMAIL PROTECTED]> wrote:
> On Mon, Oct 27, 2008 at 11:04 PM, Don Stewart <[EMAIL PROTECTED]> wrote:
>> It depends on the operations (safe indexing or unsafe indexing).
>> Being strict or unboxed doesn't determine the safety.
>
> OK, that makes sense.
>
> This is a huge load off my conscience.  I can now dig into Real World
> Haskell without hesitation, knowing that what I want in the end is
> actually possible.
>
> Thanks again,
> TW
> ___
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>



-- 
I try to take things like a crow; war and chaos don't always ruin a
picnic, they just mean you have to be careful what you swallow.

-- Jessica Edwards
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Memory efficiency questions for real-time graphics

2008-10-27 Thread T Willingham
On Mon, Oct 27, 2008 at 11:04 PM, Don Stewart <[EMAIL PROTECTED]> wrote:
> It depends on the operations (safe indexing or unsafe indexing).
> Being strict or unboxed doesn't determine the safety.

OK, that makes sense.

This is a huge load off my conscience.  I can now dig into Real World
Haskell without hesitation, knowing that what I want in the end is
actually possible.

Thanks again,
TW
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Memory efficiency questions for real-time graphics

2008-10-27 Thread Don Stewart
t.r.willingham:
> On Mon, Oct 27, 2008 at 10:07 PM, Don Stewart <[EMAIL PROTECTED]> wrote:
> >
> > Seems fine. You'll be working at a low level, with strict, mutable,
> > unboxed data structures, but that's fine: the machine loves them.
> >
> 
> Thanks for the quick reply.  One last question -- is it at all
> possible to segfault with strict, mutable, unboxed structures?  I
> don't quite understand how it knows not to overwrite or underwrite.

It depends on the operations (safe indexing or unsafe indexing).
Being strict or unboxed doesn't determine the safety.

So be careful if you have a 'Ptr a' or start using unsafeWrite.
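
For illustration, a minimal sketch of the difference (my example, not
from the thread); unsafeRead comes from the undocumented Data.Array.Base:

  import Data.Array.Base (unsafeRead)
  import Data.Array.IO (IOUArray, newListArray, readArray)

  main :: IO ()
  main = do
      arr <- newListArray (0, 2) [10, 20, 30] :: IO (IOUArray Int Int)
      x <- readArray arr 1    -- bounds-checked: index 3 would raise an exception
      y <- unsafeRead arr 1   -- unchecked: index 3 could read garbage or segfault
      print (x, y)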

-- Don

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Memory efficiency questions for real-time graphics

2008-10-27 Thread T Willingham
On Mon, Oct 27, 2008 at 10:07 PM, Don Stewart <[EMAIL PROTECTED]> wrote:
>
> Seems fine. You'll be working at a low level, with strict, mutable,
> unboxed data structures, but that's fine: the machine loves them.
>

Thanks for the quick reply.  One last question -- is it at all
possible to segfault with strict, mutable, unboxed structures?  I
don't quite understand how it knows not to overwrite or underwrite.

Cheers,
T. Willingham
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Memory efficiency questions for real-time graphics

2008-10-27 Thread Don Stewart
t.r.willingham:
>As my first Haskell exposure, I've been working through Real World
>Haskell.
> 
>I am considering converting some of my C++ graphics libraries to
>Haskell.  I've done a fair amount of googling on the subject, however
>I haven't quite been able to find clear answers to some of following
>issues.
> 
>(1) Using an OpenGL vertex array (a contiguous chunk of memory which
>is handed to the graphics card) is important.  I see the source code
>of Frag does this, so it looks like we're good.  Check.
> 
>(2) In-place modification of the vertex array is important.  Every
>vertex changes on each frame update.  And we always want more
>vertices.

I imagine these are mutable arrays, so that's fine.
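
For illustration, a minimal sketch (mine, untested) of such an in-place
per-frame update over an unboxed mutable array, with the vertices stored
as flat Floats and f standing in for the real transform:

  import Data.Array.IO (IOUArray, getBounds, readArray, writeArray)

  updateVerts :: (Float -> Float) -> IOUArray Int Float -> IO ()
  updateVerts f arr = do
      (lo, hi) <- getBounds arr
      -- overwrite every slot in place; no new array is allocated
      mapM_ (\i -> readArray arr i >>= writeArray arr i . f) [lo .. hi]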
  
>(3) Growing and shrinking the vertex array efficiently is important.
>Here the behavior of C++ std::vector happens to be ideal.  When
>growing, extend the array in-place if possible (using reserved space
>if any), otherwise allocate a new chunk ("amortized constant time").
>When shrinking, do nothing but record the smaller size; the unused
>memory chunk is reserved for possible future growth.

Easy enough. You're working with raw arrays of memory, I assume (e.g.
UArrays or Ptr a?)
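
For (3), a rough sketch (mine, untested) of the std::vector behaviour over
an IOUArray: keep a logical count next to a larger physical buffer, double
on overflow, and let shrinking merely lower the recorded count:

  import Data.Array.IO (IOUArray, getBounds, newArray_, readArray, writeArray)
  import Data.IORef (IORef, newIORef, readIORef, writeIORef)

  data GrowVec = GrowVec { buf :: IORef (IOUArray Int Float), used :: IORef Int }

  newGrowVec :: IO GrowVec
  newGrowVec = do
      arr <- newArray_ (0, 3)           -- small initial capacity
      bufRef  <- newIORef arr
      usedRef <- newIORef 0
      return (GrowVec bufRef usedRef)

  push :: GrowVec -> Float -> IO ()
  push v x = do
      n   <- readIORef (used v)
      arr <- readIORef (buf v)
      (_, hi) <- getBounds arr
      arr' <- if n > hi                 -- buffer full: double it (amortized O(1))
                then do bigger <- newArray_ (0, 2 * (hi + 1) - 1)
                        mapM_ (\i -> readArray arr i >>= writeArray bigger i) [0 .. hi]
                        writeIORef (buf v) bigger
                        return bigger
                else return arr
      writeArray arr' n x
      writeIORef (used v) (n + 1)       -- shrinking would only lower this count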
  
>(4) Doing little to no allocations each frame update is important.  In
>the current C++ version, zero allocations occur during a normal frame
>update.  When the vertex array changes, that is the only allocation
>which happens.

This is certainly possible. To confirm it, look at the generated Core
code, produced by GHC (with the ghc-core tool).

See the realworldhaskell chapter on optimisation.
  
>To give a context for all of this, I am applying a non-linear
>transformation to an object on every frame.  (Note: non-linear, so a
>matrix transform will not suffice.)
> 
>The object is described by a certain function, and is generated from a
>3D domain grid.  The user can change the resolution of the grid on the
>fly, while the object is moving.  (Hence the need for grow/shrink
>efficiency.)
> 
>Given that (1) is out of the way, what's the best I expect from
>Haskell concerning (2)-(4)?

Seems fine. You'll be working at a low level, with strict, mutable,
unboxed data structures, but that's fine: the machine loves them.

-- Don
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Memory efficiency questions for real-time graphics

2008-10-27 Thread T Willingham
As my first Haskell exposure, I've been working through Real World
Haskell.

I am considering converting some of my C++ graphics libraries to
Haskell.  I've done a fair amount of googling on the subject, however
I haven't quite been able to find clear answers to some of following
issues.

(1) Using an OpenGL vertex array (a contiguous chunk of memory which
is handed to the graphics card) is important.  I see the source code
of Frag does this, so it looks like we're good.  Check.

(2) In-place modification of the vertex array is important.  Every
vertex changes on each frame update.  And we always want more
vertices.

(3) Growing and shrinking the vertex array efficiently is important.
Here the behavior of C++ std::vector happens to be ideal.  When
growing, extend the array in-place if possible (using reserved space
if any), otherwise allocate a new chunk ("amortized constant time").
When shrinking, do nothing but record the smaller size; the unused
memory chunk is reserved for possible future growth.

(4) Doing little to no allocations each frame update is important.  In
the current C++ version, zero allocations occur during a normal frame
update.  When the vertex array changes, that is the only allocation
which happens.

To give a context for all of this, I am applying a non-linear
transformation to an object on every frame.  (Note: non-linear, so a
matrix transform will not suffice.)

The object is described by a certain function, and is generated from a
3D domain grid.  The user can change the resolution of the grid on the
fly, while the object is moving.  (Hence the need for grow/shrink
efficiency.)

Given that (1) is out of the way, what's the best I expect from
Haskell concerning (2)-(4)?

Thanks,
T. Willingham
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Memory problems reading a IntMap from a binary file

2008-08-03 Thread Don Stewart
lutzsteens:
> Hi,
> 
> I have an IntMap String with about 40,000 entries. After saving it to disk
> (via Data.Binary) the file is only 3.5 MB. However, if I load it and
> save it back again, my program needs 180 MB of memory. Am I doing
> anything wrong, or does the map really need that much memory?


String can be really expensive. Try an IntMap ByteString
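
For illustration, the same program with just that change (a sketch,
untested; note the Binary format for ByteString differs from String's, so
the existing i.bin would have to be re-encoded once):

  import Data.Binary (decodeFile, encodeFile)
  import Data.ByteString.Char8 (ByteString)
  import Data.IntMap (IntMap)
  import System.Exit (ExitCode (ExitSuccess), exitWith)

  main :: IO ()
  main = do
      mp <- decodeFile "i.bin" :: IO (IntMap ByteString)  -- packed bytes, not [Char]
      encodeFile "i2.bin" mp
      exitWith ExitSuccess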

> 
> The (simple) program I wrote:
> 
> main =
>do
>mp <- decodeFile "i.bin" :: IO ( IntMap String )
>encodeFile "i2.bin" mp
>exitWith ExitSuccess
> 
> Thanks,
> Ludger
> ___
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Memory problems reading a IntMap from a binary file

2008-08-03 Thread Ludger Steens

Hi,

I have an IntMap String with about 40,000 entries. After saving it to disk
(via Data.Binary) the file is only 3.5 MB. However, if I load it and
save it back again, my program needs 180 MB of memory. Am I doing
anything wrong, or does the map really need that much memory?


The (simple) program I wrote:

import Data.Binary (decodeFile, encodeFile)
import Data.IntMap (IntMap)
import System.Exit (ExitCode (ExitSuccess), exitWith)

main =
    do
    mp <- decodeFile "i.bin" :: IO (IntMap String)
    encodeFile "i2.bin" mp
    exitWith ExitSuccess

Thanks,
Ludger
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Memory profiling

2008-06-16 Thread Chaddaï Fouché
2008/6/16 Pieter Laeremans <[EMAIL PROTECTED]>:
> Hi,
>
> Which tools do you recommend for memory-profiling Haskell programs
> on a *nix system?
> I'm using Haskell to develop a CGI program/script.
>
> The application has to be deployed on shared hosting infrastructure.
> Since I would like to be a good citizen,
> I need to measure the maximum amount of memory allocated to the
> program during execution.
> The best I can do now is look at top, but that's not satisfactory.

Are you aware that by calling the program with the correct RTS options
you can have memory profiling ?
http://www.haskell.org/ghc/docs/latest/html/users_guide/runtime-control.html
(see option -s for simple statistics)
http://www.haskell.org/ghc/docs/latest/html/users_guide/prof-heap.html
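
For example, assuming a binary called mycgi (the name is a placeholder)
built with GHC:

./mycgi +RTS -s -RTS   # one-line summary on exit, including maximum residency
ghc --make -prof -auto-all MyCgi.hs
./mycgi +RTS -hc -RTS  # writes mycgi.hp (needs the profiling build)
hp2ps -c mycgi.hp      # renders the heap profile to PostScript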

-- 
Jedaï
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Memory profiling

2008-06-15 Thread Pieter Laeremans
Hi,

Which tools do you recommend for memory-profiling Haskell programs
on a *nix system?
I'm using Haskell to develop a CGI program/script.

The application has to be deployed on shared hosting infrastructure.
Since I would like to be a good citizen,
I need to measure the maximum amount of memory allocated to the
program during execution.
The best I can do now is look at top, but that's not satisfactory.

thanks,

Pieter



-- 
Pieter Laeremans <[EMAIL PROTECTED]>
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Memory allocation in extractor function (newbie question)

2008-04-07 Thread Alexander Kireyev
On Mon, Apr 7, 2008 at 4:16 PM, Yitzchak Gale <[EMAIL PROTECTED]> wrote:
>  You didn't show us the code for countForPoints. I'll bet you wrote
>  something like
>
>  countForPoints area ls count points =
>   sum $ map (countPathsFrom area (count + 1) ls) points
>
>  Unfortunately, the standard sum function is too lazy in many situations.
>  And if you wrote out the recursion for countForPoints by hand, you
>  may have run into the same problem. In either case, you'll be accumulating
>  huge amounts of unevaluated (+) thunks instead of adding up the
>  count as you go along.
>
>  If this is the problem, you can fix it by using foldl', a strict version
>  of foldl from Data.List, instead of sum:
>
>  countForPoints area ls count points =
>   foldl' (+) 0 $ map (countPathsFrom area (count + 1) ls) points

Thanks for the pointer. In fact my version was already using the fold
function, albeit the lazy version of it:

countForPoints :: Area -> String -> Int -> [Point] -> Int
countForPoints area goal count points =
   let newCount = foldl (countSubPaths area goal) count points
   in if countExceeded newCount
  then -1
  else newCount

countExceeded :: Int -> Bool
countExceeded count = (count < 0) || (count > countLimit)

countSubPaths :: Area -> String -> Int -> Point -> Int
countSubPaths area goal count point =
 if (countExceeded count)
 then -1
 else countPathsFrom area count goal point

Switching to the strict version reduced the amount of garbage in that
function and in countSubPaths, but the problem with 'letterAt' remains. I
was thinking that adding strictness to that function might help, but no
amount of exclamation marks has helped so far.

-Alexander
// copying to the mailing list, because first reply didn't get through.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Memory allocation in extractor function (newbie question)

2008-04-07 Thread Yitzchak Gale
Alexander Kireyev wrote:
>  While trying to write a program for the countPaths Code Jam problem I
>  ran into what seems to me as a weird behaviour in terms of memory
>  allocation...
>  The profiling log (./paths +RTS -P) shows the following time/space
>  behaviour for them...

Hi Alexander,

I'm not sure I can fully explain your profiling behavior either, but
I'll take a stab at what the problem might be anyway.

You didn't show us the code for countForPoints. I'll bet you wrote
something like

countForPoints area ls count points =
  sum $ map (countPathsFrom area (count + 1) ls) points

Unfortunately, the standard sum function is too lazy in many situations.
And if you wrote out the recursion for countForPoints by hand, you
may have run into the same problem. In either case, you'll be accumulating
huge amounts of unevaluated (+) thunks instead of adding up the
count as you go along.

If this is the problem, you can fix it by using foldl', a strict version
of foldl from Data.List, instead of sum:

countForPoints area ls count points =
  foldl' (+) 0 $ map (countPathsFrom area (count + 1) ls) points
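
A toy case showing the difference (illustration only): compiled without
optimisation, the lazy fold builds millions of pending (+) applications
before any addition happens, and typically overflows the stack, while
foldl' runs in constant space:

  import Data.List (foldl')

  main :: IO ()
  main = do
      print (foldl' (+) 0 [1 .. 10000000 :: Int])  -- forces the sum as it goes
      print (foldl  (+) 0 [1 .. 10000000 :: Int])  -- piles up unevaluated (+) thunks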

Hope this helps,
Yitz
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Memory allocation in extractor function (newbie question)

2008-04-07 Thread Alexander Kireyev
Hello,

While trying to write a program for the countPaths Code Jam problem I
ran into what seems to me as a weird behaviour in terms of memory
allocation.

The task is to count the number of ways you can spell a certain "word"
by walking some path on a board of letters.

Being a newbie, I started with a very straightforward recursive
approach, which is not efficient but still helps to improve my
understanding of Haskell basics. Quite naturally I quickly hit the
point where the program became too slow, so I tried to profile and
optimize the code.

I used 3 data types:

data GridEntry = GridEntry {letter :: Char, neighbors :: [Point]}
data Area = Area {letters :: (Array (Int, Int) GridEntry), width :: Int, height :: Int}
data Point = Point { x :: Int, y :: Int} deriving (Show, Eq)

With the recursive approach I had to retrieve GridEntries (a letter at
a certain point of the grid together with the list of its neighbours on
the grid) a lot. Quite surprisingly I saw that this operation was
overly expensive in my program. The most time was spent in these 2
functions:

countPathsFrom :: Area -> Int -> String -> Point -> Int
countPathsFrom _ count [] _ = count
countPathsFrom area count (l:[]) point =
    count + if letter (letterAt area point) == l
              then 1
              else 0
countPathsFrom area count (l:ls) point =
    let gridPt = letterAt area point
        points = neighbors gridPt
    in if letter gridPt == l
         then countForPoints area ls count points
         else count

letterAt :: Area -> Point -> GridEntry
letterAt area (Point x y) = letters area ! (x, y)

The profiling log (./paths +RTS -P) shows the following time/space
behaviour for them:

COST CENTRE     MODULE  no.   entries   individual %time %alloc   inherited %time %alloc   ticks  bytes
countPathsFrom  Main    181  11830476              37.8   52.3              59.5   82.2       28  1325013312
letterAt        Main    184  11830476              21.6   29.9              21.6   29.9       16  757150464

What puzzles me here is how the function letterAt generates 64 bytes
of garbage per call. Since there are no side-effects in this section
of the program - and no one can damage the 'area' structure - I would
expect the compiler/runtime to use a reference to the data in the
'area' structure when returning values from letterAt. Instead it seems
to make a copy of GridEntry and return that. The question here is
whether (why?) this behaviour is normal and if it is possible to
optimize the code in such cases.

Thanks in advance,
-Alexander
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Memory Leak Help

2007-11-11 Thread Bertram Felgenhauer
J. Garrett Morris wrote:
> Hello,
> 
> I have code which seems to contain a memory leak, but I'm not sure
> where it is or what's causing it.  Any help would be greatly
> appreciated:

I see no memory leak in your code; it "just" breaks the garbage
collector's heuristics by allocating an awful lot of data at once and
then using no more memory.

Try running your program with  +RTS -F1.1 -RTS. (The -F option controls
by how much the RTS increases the heap size when the heap gets exhausted.
The default is to double it, but for this program a modest increment is
much more sensible).

> data Ratings = Ratings { movieCount :: Int
>, movieLookup :: IOUArray Int Word32
>, movieRatings :: IOUArray Word32 Word32 }
[snip]
>   | otherwise = do writeArray movieRatings i rating-- ***

Btw, my first instinct was to suggest using ... $! rating  here, but with
unboxed arrays that's not necessary.

[snip]

Hope that helps,

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Memory Leak Help

2007-11-11 Thread J. Garrett Morris
Hello,

I have code which seems to contain a memory leak, but I'm not sure
where it is or what's causing it.  Any help would be greatly
appreciated:

The code is:

{-# LANGUAGE BangPatterns #-}

import Data.Array.IO (IOUArray, newArray_, writeArray)
import Data.Bits (shiftL, (.|.))
import Data.Word (Word32)
import System.Environment (getArgs)
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as C

data Ratings = Ratings { movieCount :: Int
                       , movieLookup :: IOUArray Int Word32
                       , movieRatings :: IOUArray Word32 Word32 }

readRatingsFromText :: Int -> Int -> C.ByteString -> IO Ratings
readRatingsFromText movieCount ratingCount text =
    do movieLookup <- newArray_ (0, movieCount - 1)
       movieRatings <- newArray_ (0, fromIntegral ratingCount - 1)
       iter movieLookup movieRatings 0 (C.lines text)
       return (Ratings movieCount movieLookup movieRatings)
    where iter :: IOUArray Int Word32 -> IOUArray Word32 Word32
               -> Word32 -> [C.ByteString] -> IO ()
          iter !movieLookup !movieRatings !i [] = return ()
          iter !movieLookup !movieRatings !i (s:ss)
              | c == ':' = do writeArray movieLookup movie i
                              iter movieLookup movieRatings i ss
              | otherwise = do writeArray movieRatings i rating    -- ***
                               iter movieLookup movieRatings (i + 1) ss
              where Just (x, rest) = C.readInt s
                    c              = C.head rest
                    Just (y, _)    = C.readInt (C.tail rest)
                    movie          = x - 1
                    rating         = fromIntegral x `shiftL` 3 .|.
                                     fromIntegral y :: Word32

main = do args <- getArgs
          ratings <- readRatingsFromText (read $ args !! 0) (read $ args !! 1)
                         =<< B.readFile (args !! 2)
          return ()


This attempts to read a file that looks like:

1:
401,2
503,1
...
2:
506,5
230,4

possibly with additional junk at the end of each line, into an array.
When I run this, memory usage grows without bound.

If I comment out the line with -- ***, the memory profile is flat.
Without storing the ratings, it's not fantastically useful; however,
it does demonstrate that it should be possible to iterate through the
file with a flat memory profile (which is my goal).

I'm compiling with ghc --make -O2

Any help would be greatly appreciated.  Thanks!

 /g

-- 
The man who'd introduced them didn't much like either of them, though
he acted as if he did, anxious as he was to preserve good relations at
all times. One never knew, after all, now did one now did one now did
one.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Memory-mapped arrays? (IArray interfaces, slices, and so on)

2007-11-08 Thread David Roundy
On Wed, Nov 07, 2007 at 10:10:16PM +, Jules Bean wrote:
> Joel Reymont wrote:
> >Is there such a thing as memory-mapped arrays in GHC?
> 
> In principle, there could be an IArray instance to memory-mapped files.
> 
> (There could also be a mutable version, but just the IArray version 
> would be useful).

The IArray instance would be unsafe, however, because the contents of the
file could change after you opened it, breaking referential transparency.
I don't know what all is possible with file open modes, but I don't think
you can guarantee that once you've opened a file it won't change (unless
you unlink it, and know that no one else has an open file handle to it).
It may be that by opening it in write mode you could ensure that no one else
modifies it (although I don't think this would work e.g. on NFS), but then
you eliminate the very useful possibility of mmapping read-only files as
IArrays (e.g. to access /usr/share/dict/words).

So it seems reasonable that the mutable version would necesarily be
primary, with the IArray version accessible only by an unsafe operation.
-- 
David Roundy
Department of Physics
Oregon State University
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Memory-mapped arrays? (IArray interfaces, slices, and so on)

2007-11-07 Thread Stefan O'Rear
On Wed, Nov 07, 2007 at 10:10:16PM +, Jules Bean wrote:
> Joel Reymont wrote:
>> Is there such a thing as memory-mapped arrays in GHC?
>
> In principle, there could be an IArray instance to memory-mapped files.
>
> (There could also be a mutable version, but just the IArray version would 
> be useful).
>
> I noticed just the other day that there are some 'obvious' IArray
> constructors missing. It ought, for example, to be possible to build a new
> IArray from an old one using a subset of the elements: a dimensional slice
> going from an (Int,Int,Int)-indexed array to (Int,Int), or a stride taking
> 'one element in three' along each axis, etc.
>
> Annoyingly, it doesn't seem to be straightforward to make your own 
> instances of IArray, since the important methods aren't exported.

They are, from the undocumented module Data.Array.Base.

Stefan


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Memory-mapped arrays? (IArray interfaces, slices, and so on)

2007-11-07 Thread Jules Bean

Joel Reymont wrote:
> Is there such a thing as memory-mapped arrays in GHC?

In principle, there could be an IArray instance to memory-mapped files.

(There could also be a mutable version, but just the IArray version
would be useful).

I noticed just the other day that there are some 'obvious' IArray
constructors missing. It ought, for example, to be possible to build a new
IArray from an old one using a subset of the elements: a dimensional slice
going from an (Int,Int,Int)-indexed array to (Int,Int), or a stride
taking 'one element in three' along each axis, etc.

Annoyingly, it doesn't seem to be straightforward to make your own
instances of IArray, since the important methods aren't exported.

I think there is real scope for some expansion here.

Jules
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Memory-mapped arrays?

2007-11-07 Thread Don Stewart
joelr1:
> Is there such a thing as memory-mapped arrays in GHC?
> 
> I'm looking for something that would let me memory-map a file of  
> floats and access it as an array.
> 

There's a commented out mmapFile for ByteString in Data.ByteString's
source. Use that, and then extract the ForeignPtr from the resulting 
ByteString, and castPtr it to a Ptr CFloat, then you're in business.
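
In sketch form (untested, and assuming you have some mmapFile :: FilePath
-> IO ByteString in scope, since the one in the source is commented out),
with a recent bytestring where toForeignPtr lives in Data.ByteString.Internal:

  import qualified Data.ByteString.Internal as BI
  import Foreign.C.Types (CFloat)
  import Foreign.ForeignPtr (withForeignPtr)
  import Foreign.Ptr (plusPtr)
  import Foreign.Storable (peekElemOff)

  floatAt :: BI.ByteString -> Int -> IO CFloat
  floatAt bs i =
      let (fp, off, _len) = BI.toForeignPtr bs    -- raw buffer behind the ByteString
      in withForeignPtr fp $ \p ->
             peekElemOff (p `plusPtr` off) i      -- read the i-th CFloat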

-- Don
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Memory-mapped arrays?

2007-11-07 Thread Joel Reymont

Is there such a thing as memory-mapped arrays in GHC?

I'm looking for something that would let me memory-map a file of  
floats and access it as an array.


Thanks, Joel

--
http://wagerlabs.com





___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Memory leak or wrong use of Array ?

2007-09-14 Thread Brandon S. Allbery KF8NH


On Sep 14, 2007, at 21:35, L.Guo wrote:
> Thanks for your advice about thunks, though I do not understand *thunk*
> very well. Are there any other descriptions of thunks?

A "thunk" is, in general, a piece of code which represents a
suspended or delayed action.  In Haskell, it represents a lazy
computation: Haskell will only evaluate the code if the value is
actually needed, and even then only just enough to satisfy the
immediate need (thus, the result of evaluating a thunk may be a
value, or another thunk).
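
A quick way to watch one, assuming a GHCi new enough to have :sprint (it
prints unevaluated parts as '_'):

  Prelude> let t = 1 + 2 :: Int    -- 't' is a thunk: nothing computed yet
  Prelude> :sprint t
  t = _
  Prelude> t `seq` ()              -- demand the value
  ()
  Prelude> :sprint t
  t = 3                            -- the thunk has been overwritten by 3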


--
brandon s. allbery [solaris,freebsd,perl,pugs,haskell] [EMAIL PROTECTED]
system administrator [openafs,heimdal,too many hats] [EMAIL PROTECTED]
electrical and computer engineering, carnegie mellon universityKF8NH


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Memory leak or wrong use of Array ?

2007-09-14 Thread L.Guo
Hi Stuart.

Thanks for your advice about thunks, though I do not understand *thunk*
very well. Are there any other descriptions of thunks?

I have tried the *seq* operation. When the input is 10,000,000, the memory
still "leaks", and there is still a stack overflow.

I changed some mapM_ to sequence_ . map f, and tried to save some divisions.

The key functions now looks like this:

proportions n = unsafePerformIO $ do
    arr <- newArray (2,n) (False,1/1) :: Fractional t => IO (IOArray Int (Bool,t))
    sequence_ $ map (sieve arr n) [2..n]
    factors <- getElems arr
    return . map (\(n,(b,f)) -> (f,n)) $ zip [2..n] factors
  where
    sieve arr ubound p = do
        (b,o) <- readArray arr p
        if b then return () else
          sequence_ . map (update arr (toRational p)) . takeWhile (<=ubound) $ iterate (+p) p
    update arr p i = do
        (_,o) <- readArray arr i
        --writeArray arr i (True,o*(p-1)/p)
        let val = o * p / (p-1)
        val `seq` return () -- force the thunk
        writeArray arr i (True, val)

solutionOf = snd . minimum
           . filter (\(f,n) -> isPerm (floor $ toRational n/f) n) . proportions

--   
L.Guo
2007-09-15


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Memory leak or wrong use of Array ?

2007-09-13 Thread Stuart Cook
On 9/14/07, L.Guo <[EMAIL PROTECTED]> wrote:
> Hi MailList Haskell-Cafe:
>
> I am trying to solve Project Euler problem 70,
> and wrote some code (at the end of this mail).
> I run the code in GHCi.
>
> The problem is that when the input is 1,000,000 it works
> fine, but when the input goes up to 10,000,000 the memory GHCi
> uses increases very fast and does not stop.
>
> Is this a memory leak? Or is there some misunderstanding
> about arrays?

Haskell's boxed arrays are non-strict, which can cause space leaks
when you don't actually need the laziness. The usual quick-&-dirty
solution is to switch to unboxed arrays (e.g. IOUArray), which are
inherently strict, but they're only available for certain primitive
element types.

The solution, then, is to make sure you adequately force your thunks
before storing them in an array.

> update arr p i = do
> (_,o) <- readArray arr i
> writeArray arr i (True,o*(p-1)/p)

Here's a possible fix, though it's untested, and I'm sure there are
more elegant ways to write it:

  update arr p i = do
  (_, o) <- readArray arr i
  let val = o * (p-1) / p
  val `seq` return () -- force the thunk
  writeArray arr i (True, val)

This ensures that the second component of the pair is a value, rather
than a thunk.
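
An equivalent sketch using GHC's BangPatterns extension, assuming it is
enabled (the bang desugars to the same seq):

  {-# LANGUAGE BangPatterns #-}
  update arr p i = do
      (_, o) <- readArray arr i
      let !val = o * (p-1) / p    -- the bang forces val before the write
      writeArray arr i (True, val)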

Strictifying data such as numbers and booleans is relatively easy, but
things get tricky when you try to force more complicated structures
(deepSeq, anyone?).
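
For those complicated structures, a sketch in terms of an NFData-style
deep force (Control.DeepSeq from the deepseq package -- an assumption
here, since it may postdate this thread):

  import Control.DeepSeq (deepseq)

  main :: IO ()
  main = let xs = map (+ 1) [1 .. 10 :: Int]
         in xs `deepseq` print (length xs)  -- forces every element, not just the spine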


Stuart
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Memory leak or wrong use of Array ?

2007-09-13 Thread L.Guo
Hi MailList Haskell-Cafe:

I am trying to solve Project Euler problem 70,
and wrote some code (at the end of this mail).
I run the code in GHCi.

The problem is that when the input is 1,000,000 it works
fine, but when the input goes up to 10,000,000 the memory GHCi
uses increases very fast and does not stop.

Is this a memory leak? Or is there some misunderstanding
about arrays?

Regards
--
-- Modules :
import Data.Array.IO
import Data.List ( sort )
import Foreign ( unsafePerformIO )
-- Codes :
p070_solve = putStrLn . show $ solutionOf 1000
  where
    isPerm a b = sort (show a) == sort (show b)
    phis n = unsafePerformIO $ do
        arr <- newArray (2,n) (False,1/1) :: Fractional t => IO (IOArray Int (Bool,t))
        mapM_ (sieve arr n) [2..n]
        factors <- getElems arr
        return . map (\(n,(b,f)) -> (n,floor $ toRational n*f)) $ zip [2..n] factors
      where
        sieve arr ubound p = do
            (b,o) <- readArray arr p
            if b then return () else
              mapM_ (update arr (toRational p)) . takeWhile (<=ubound) $ iterate (+p) p
        update arr p i = do
            (_,o) <- readArray arr i
            writeArray arr i (True,o*(p-1)/p)
    solutionOf = snd . minimum
               . map (\(n,phi)->(toRational n / toRational phi,n))
               . filter (uncurry isPerm) . phis
--
L.Guo
2007-09-14

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Memory profiler

2007-07-27 Thread Stefan O'Rear
On Sat, Jul 28, 2007 at 12:11:31AM +0100, Jon Harrop wrote:
> Is there a memory profiler for Haskell?

Yes.  GHC, NHC and HBC all have integrated heap profilers.

ghc --make -prof -auto-all ...
./MyProgram +RTS -hc -RTS
./MyProgram +RTS -hm -RTS
./MyProgram +RTS -hd -RTS
./MyProgram +RTS -hy -RTS
...

http://haskell.org/ghc/dist/current/docs/users_guide/prof-heap.html

Stefan


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Memory profiler

2007-07-27 Thread Jon Harrop

Is there a memory profiler for Haskell?

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
OCaml for Scientists
http://www.ffconsultancy.com/products/ocaml_for_scientists/?e
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] memory usage in repeated reading of an external program's output

2007-06-21 Thread Andrea Rossato
On Thu, Jun 21, 2007 at 01:36:16PM -0700, Bryan O'Sullivan wrote:
>  Andrea Rossato wrote:
> 
> > Still I do not understand you reference to the leak problem. Could you
> > please elaborate a bit?
> 
>  The runProcess function returns a ProcessHandle.  If you don't call 
>  waitForProcess on that handle, you'll leak those handles.  On Unix-like 
>  systems, this means you'll accumulate zombie processes and potentially fill 
>  your process table, DoSing your machine.
> 

ahhh, yes, I found out the hard way. By the way, in the code I'm
writing I was waiting for the exit code of the process. I forgot to
copy it in the first example, and when I run it...;-)

as you can see, the second version already corrected the problem.

Thanks for your kind attention.

All the best.
andrea  
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

