Re: [Haskell-cafe] memory-efficient data type for Netflix data - UArray Int Int vs UArray Int Word8

Manlio Perillo Thu, 26 Feb 2009 13:15:28 -0800

Kenneth Hoste wrote:

[...]
However, as I posted yesterday, I've been able to circumvent the issue by rethinking my data type, i.e. using the ~18K movie IDs as key instead of the 480K user IDs, which radically limits the overhead...

Well, but what if you really need the original data structure, for better data processing?

That way, I'm able to fit the data set in <700M of memory, without having to reorganize the raw data.
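
As an illustration of the movie-keyed layout described above, here is a minimal sketch. The thread does not show the actual types, so every name here (MovieRatings, RatingsByMovie, fromTriples, ratingCount) is hypothetical, and the use of IntMap from containers plus UArray from array is an assumption. It relies on ratings being 1..5, so a Word8 per rating is enough.

```haskell
import Data.Array.Unboxed (UArray, listArray, bounds)
import qualified Data.IntMap.Strict as IntMap
import Data.Word (Word8)

-- Hypothetical names: the original message does not show its types.
type UserId  = Int
type MovieId = Int
type Rating  = Word8   -- ratings are 1..5, so one byte per rating suffices

-- Per movie: two parallel unboxed arrays, user IDs and their ratings.
data MovieRatings = MovieRatings
  { users   :: !(UArray Int UserId)
  , ratings :: !(UArray Int Rating)
  }

-- Only ~18K keys (movies) instead of ~480K keys (users).
type RatingsByMovie = IntMap.IntMap MovieRatings

-- Pack a list of (user, rating) pairs into the two unboxed arrays.
mkMovieRatings :: [(UserId, Rating)] -> MovieRatings
mkMovieRatings prs = MovieRatings
  { users   = listArray (0, n - 1) (map fst prs)
  , ratings = listArray (0, n - 1) (map snd prs)
  }
  where n = length prs

-- Group (movie, user, rating) triples by movie, keeping input order.
-- The list append is quadratic per movie; good enough for a sketch.
fromTriples :: [(MovieId, UserId, Rating)] -> RatingsByMovie
fromTriples ts =
  IntMap.map mkMovieRatings
    (IntMap.fromListWith (flip (++)) [ (m, [(u, r)]) | (m, u, r) <- ts ])

-- Number of ratings stored for a movie (0 if the movie is absent).
ratingCount :: MovieId -> RatingsByMovie -> Int
ratingCount m db = maybe 0 count (IntMap.lookup m db)
  where count mr = let (lo, hi) = bounds (ratings mr) in hi - lo + 1
```

The point of the design is that the per-key overhead (map node plus array headers) is paid once per movie rather than once per user, while the bulk of the data sits in flat unboxed arrays.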
The uvector package implements a vector of unboxed types, and has a snocU operation, to append an element to the array.
I don't know how efficient it is, however.
By the way, about uvector: it has a Stream data type, and you can build a vector from a stream.
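
For comparison only: when the number of elements is known in advance, an unboxed array can be filled in place with the standard array package instead of being grown one element at a time. This is a different technique from uvector's snocU or Stream and says nothing about how those are implemented; fromListN is a hypothetical helper name.

```haskell
import Control.Monad (forM_)
import Data.Array.ST (newArray, writeArray, runSTUArray)
import Data.Array.Unboxed (UArray, elems)
import Data.Word (Word8)

-- Fill an unboxed array of known size n in a single pass.
-- The array is mutated inside ST and frozen without copying at the end.
-- Slots beyond the end of the input list stay at 0.
fromListN :: Int -> [Word8] -> UArray Int Word8
fromListN n xs = runSTUArray $ do
  arr <- newArray (0, n - 1) 0
  forM_ (zip [0 .. n - 1] xs) $ \(i, x) -> writeArray arr i x
  return arr
```

For example, `elems (fromListN 3 [1, 2, 3])` gives back the three elements in order.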
Thanks for letting me know, I'll keep this in mind.


Let me know if there are performance improvements.

Arrays are one of the few things I dislike in Haskell, and all the available array/vector packages cause me some confusion.





Regards   Manlio