I am using environments to avoid making copies (by keeping references).
But it seems like there is a hidden copy going on somewhere - for
example in the code fragment below, I am creating a reference to "y"
(of size 500MB) and storing the reference in object "data". But when I
save "data" and the

I have a data set of roughly 700MB which grows to 2GB during processing
(I'm using a 4GB Linux box). After the work is done I clean up (rm())
and the state is returned to 700MB. Yet I find I cannot run the same
routine again as it claims to not be able to allocate memory even though
gcinfo()

I'm trying to read in datasets with roughly 150,000 rows and 600
features. I wrote a function using scan() to read it in (I have a 4GB
Linux machine) and it works like a charm. Unfortunately, converting the
scanned list into a data frame using as.data.frame() causes the memory
usage to explode (it c
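
For the scan() to data frame conversion described above, one possible workaround is to skip as.data.frame() and set the data frame attributes on the scanned list directly. The sketch below is only an illustration under stated assumptions: the list is named, all of its elements have the same length, and the small `scanned` list stands in for the real scan() result. `to_data_frame` is a hypothetical helper name, not something from the original post. The column vectors should not need to be duplicated this way, though the exact copying behaviour depends on the R version.

```r
# Small stand-in for the list returned by scan(); the real data would have
# roughly 150,000 rows and 600 columns (assumption: named, equal-length columns).
n <- 1000L
scanned <- list(a = runif(n), b = runif(n), c = sample.int(n))

# Hypothetical helper: promote a list to a data frame by setting the
# "row.names" and "class" attributes, instead of calling as.data.frame().
to_data_frame <- function(x) {
  stopifnot(is.list(x), length(x) > 0L, !is.null(names(x)))
  stopifnot(length(unique(lengths(x))) == 1L)  # all columns same length
  # Compact integer row names, the same form R uses internally for 1:n
  attr(x, "row.names") <- .set_row_names(length(x[[1L]]))
  class(x) <- "data.frame"
  x
}

df <- to_data_frame(scanned)
```

Running tracemem() on one of the columns before the conversion is one way to check whether a copy is actually made on a given R installation.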