Thanks for the input, but it looks like I found a simple solution.
It turns out that if you assign list elements by name (with $) rather than
by index (with [[ ]]), R doesn't make extra copies:
> x<-double(10^9)
> mylist<-list()
> system.time(mylist[[1]]<-x)
user system elapsed
2.992 3.352 6.364
> x<-double(10^9)
> mylist<-list()
> system.time(mylist$x<-x)
user system elapsed
0 0 0
This is on R version 3.0.1.
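
For anyone who wants to check this on their own build without allocating
10^9 doubles, here is a minimal sketch using tracemem(), which reports when
a traced object is duplicated. It assumes R was built with memory profiling
(capabilities("profmem") is TRUE); the names by_index and by_name are just
for illustration. Copying behavior may differ between R versions, so the
result need not match the timings above.

```r
# Sketch: check whether storing a vector in a list duplicates it.
# Uses a much smaller vector than 10^9 so it runs on a modest machine.
x <- double(1e6)

# tracemem() is only functional if R was built with memory profiling.
tracing <- isTRUE(capabilities("profmem"))
if (tracing) invisible(tracemem(x))  # a "tracemem[...]" line is printed on each copy of x

by_index <- list()
by_index[[1]] <- x   # assignment by index

by_name <- list()
by_name$x <- x       # assignment by name

if (tracing) untracemem(x)
```

If a "tracemem[...]" line appears right after one of the two assignments,
that assignment duplicated x; if nothing is printed, no copy was made.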
On 08/16/2013 10:37 PM, David Winsemius wrote:
On Aug 16, 2013, at 2:23 PM, Gang Peng wrote:
>If you don't want to copy the data, you can use environments. You can first
>define x and y in the global environment and then in the function, use
>function get() to get x, y in the global environment. When you change x and
>y in the function, x and y also change in the global environment.
>
That doesn't sound like the behavior I expect in R. Do you care to illustrate
this?
-- David.
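
A minimal sketch of the distinction in question, assuming a fresh
environment (env) rather than the global one, with two hypothetical helper
functions: get() returns a value, so modifying that value inside a function
changes only a local copy, whereas assigning through the environment itself
changes the object the caller sees.

```r
# Sketch: get() hands back a value; modifying it does not touch the original binding.
env <- new.env()
assign("x", c(1, 2, 3), envir = env)

# Hypothetical helper: fetches x with get() and modifies the fetched value.
modify_local <- function(e) {
  x <- get("x", envir = e)  # local variable bound to the value of e's x
  x[1] <- 100               # changes the local x only
  invisible(NULL)
}

# Hypothetical helper: assigns directly into the environment.
modify_in_env <- function(e) {
  e$x[1] <- 100             # updates x inside the environment itself
  invisible(NULL)
}

modify_local(env)
after_local <- env$x        # x in env as seen by the caller after modify_local()

modify_in_env(env)
after_env <- env$x          # x in env as seen by the caller after modify_in_env()
```

Here after_local is still c(1, 2, 3), while after_env is c(100, 2, 3).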
>Best,
>Gang
>
>
>2013/8/16 MRipley<mrip...@gmail.com>
>
>>Usually R is pretty good about not copying objects when it doesn't need
>>to. However, the list() function seems to make unnecessary copies. For
>>example:
>>
>>>system.time(x<-double(10^9))
>> user system elapsed
>> 1.772 4.280 7.017
>>>system.time(y<-double(10^9))
>> user system elapsed
>> 2.564 3.368 5.943
>>>system.time(z<-list(x,y))
>> user system elapsed
>> 5.520 6.748 12.304
>>
>>I have a function where I create two large arrays, manipulate them in
>>certain ways, and then return both as a list. I'm optimizing the function,
>>so I'd like to be able to build the return list quickly. The two large
>>arrays drop out of scope immediately after I make the list and return it,
>>so copying them is completely unnecessary.
>>
>>Is there some way to do this? I'm not familiar with manipulating lists
>>through the .Call interface, and haven't been able to find much about this
>>in the documentation. Might it be possible to write a fast (but possibly
>>unsafe) list function using .Call that doesn't make copies of the arguments?
>>
>>PS A few things I've tried. First, this is not due to triggering garbage
>>collection -- even if I call gc() before list(x,y), it still takes a long
>>time.
>>
>>Also, I've tried rewriting the function by creating the list at the
>>beginning as in:
>>result <- list(x=double(10^9),y=double(10^9))
>>and then manipulating result$x and result$y but this made my code run
>>slower, as R seemed to be making other unnecessary copies while
>>manipulating elements of a list like this.
>>
>>I've considered (though not implemented) creating an environment rather
>>than a list, and returning the environment, but I'd rather find a simple
>>way of creating a list without making copies if possible.
>>
>>______________________________________________
>>R-help@r-project.org mailing list
>>https://stat.ethz.ch/mailman/listinfo/r-help
>>PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>and provide commented, minimal, self-contained, reproducible code.
>>
David Winsemius
Alameda, CA, USA