Claus Reinke ha scritto:
    IntMap (UArr (Word16 :*: Word8))

I was adding elements to the map using something like:

    v = map singletonU (a :*: b)

    insertWith appendU k v m

However doing this eats a *lot* of memory.

Since 'insertWith' doesn't actually do the 'appendU', the appends
will also be compressed in time, at point of use, rather than spread
out in time, over construction. Which might be inconvenient if large
volumes of data are involved.


Ah, ok; that may explain the problem, but I'm still not sure.


With this program:
http://hpaste.org/fastcgi/hpaste.fcgi/view?id=3063

Memory usage seems ok.
955 MB, against 622 MB when ratings are grouped by movies
(  using [(MovieID, UArr (Word32 :*: Word8))]  )


The version using `insertWith appendU` is here:
http://hpaste.org/fastcgi/hpaste.fcgi/view?id=3063#a3065


I still can not explain so much difference in memory usage.

So the question is: why appending an array of only one element to an existing array causes memory problems?

It must copy the entire array.

And doing this repeatedly, with arrays of increasing length, would
not be a good idea.

I can't really see other solutions.

For single-threaded use, one might use fusion
to avoid the construction/recopying of the intermediate arrays.

Fusion is not possible, in my case, IMHO.
I'm parsing movie ratings, where we have customers ratings for each movie in separate files (17770 total movies).

Each file has format
<movie1>:
<customer1>,<rating1>,<date1>
<customer2>,<rating2>,<date2>
...
<movie2>:
<customer3>,<rating3>,<date3>
...


I want to group ratings by customers, instead of grouping them by movies.
Stream fusion is only possible in the latter case (but in this case I don't even need an IntMap).


I'm missing something?

> [...]


Manlio
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Reply via email to