Daniel Peebles wrote:
I have added UAProd-based Binary instances to my copy of the uvector
repo at http://patch-tag.com/publicrepos/pumpkin-uvector .


I can confirm that it works for me.

However, I now have a memory problem with data decoding.

I need to serialize the Netflix Prize training dataset.
When I parse the data from the original data set, memory usage is about 640 MB [1].

But when I load the serialized and compressed data (as [UArr (Word32 *:* Word8)]), memory usage is about 840 MB...

The culprit is probably the decoding of the list (17770 elements).
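If decoding the outer list is indeed the culprit, one thing worth trying is forcing each chunk as it is decoded, so the Get monad does not build up a long chain of thunks. Below is a minimal sketch; `getStrictList` is a hypothetical helper (not part of the binary package), and it assumes the default length-prefixed list encoding that binary's `[a]` instance produces. For an unboxed array such as UArr, forcing to WHNF is enough to materialize the whole chunk.

```haskell
module Main where

import Data.Binary (Binary, get, encode)
import Data.Binary.Get (Get, runGet)
import Data.Word (Word32)

-- Hypothetical helper: decode a length-prefixed list, forcing each
-- element to WHNF as it is read, so no thunks accumulate during
-- decoding.  Compatible with binary's default list encoding
-- (an Int length followed by the elements).
getStrictList :: Binary a => Get [a]
getStrictList = do
    n <- get :: Get Int
    go n []
  where
    go 0 acc = return (reverse acc)
    go k acc = do
        x <- get
        x `seq` go (k - 1) (x : acc)

main :: IO ()
main = do
    let xs = [1 .. 10] :: [Word32]
        bs = encode xs                  -- default length-prefixed encoding
        ys = runGet getStrictList bs
    print (ys == xs)
```

Whether this helps depends on where the 840 MB actually goes; heap profiling (+RTS -hy) would show if it is thunks from decoding or the arrays themselves.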



[1] I have written a Python script that parses the same data, and it
    takes only 491 MB
    (using a list of tuples, each holding two compact NumPy arrays).
    So GHC has memory problems here.




Thanks,
Manlio Perillo
_______________________________________________
Haskell-Cafe mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/haskell-cafe
