Thanks for the input, but it looks like I found a simple solution. Turns out that if you assign to lists by name, then R doesn't make extra copies:

> x<-double(10^9)
> mylist<-list()
> system.time(mylist[[1]]<-x)
   user  system elapsed
  2.992   3.352   6.364

> x<-double(10^9)
> mylist<-list()
> system.time(mylist$x<-x)
   user  system elapsed
      0       0       0

This is on R version 3.0.1.
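
Timings are only indirect evidence of copying. A more direct check is tracemem(), which reports whenever a marked object is duplicated. The following is a minimal sketch rather than a definitive test: it assumes an R build with memory profiling enabled, uses a much smaller vector than the timings above, and the names by_index and by_name are purely illustrative. The result may differ between R versions.

```r
# Smaller vector than the double(10^9) used above, so the check runs quickly.
x <- double(1e6)

# tracemem() needs memory profiling support; this is an assumption about the build.
traced <- capabilities("profmem")
if (traced) tracemem(x)   # R prints a "tracemem[...]" line each time x is duplicated

by_index <- list()
by_index[[1]] <- x        # assignment by position

by_name <- list()
by_name$x <- x            # assignment by name

if (traced) untracemem(x)
```

If no tracemem line is printed for an assignment, that assignment did not duplicate x.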


On 08/16/2013 10:37 PM, David Winsemius wrote:
On Aug 16, 2013, at 2:23 PM, Gang Peng wrote:

>If you don't want to copy the data, you can use environments. You can first
>define x and y in the global environment and then in the function, use
>function get() to get x, y in the global environment. When you change x and
>y in the function, x and y also change in the global environment.
>
That doesn't sound like the behavior I expect in R. Do you care to illustrate 
this?

-- David.
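
For reference, the behavior in question can be sketched as follows. This is an illustration under stated assumptions, not code from the thread: env stands in for the global environment, and modify_local and modify_in_env are hypothetical helper names. The value returned by get() is bound to a local variable, so modifying it does not change the original; an explicit assignment into the environment is needed for that.

```r
# env stands in for the global environment (an assumption for this sketch).
env <- new.env()
assign("x", c(1, 2, 3), envir = env)

# Hypothetical helper: fetch with get(), then modify the fetched value.
modify_local <- function(e) {
  x <- get("x", envir = e)   # x is now a local binding inside the function
  x[1] <- 100                # changes the local binding only
  invisible(NULL)
}

# Hypothetical helper: assign directly into the environment.
modify_in_env <- function(e) {
  e$x[1] <- 100              # environments have reference semantics
  invisible(NULL)
}

modify_local(env)
after_local <- env$x[1]      # value in env after the get()-based change

modify_in_env(env)
after_env <- env$x[1]        # value in env after assigning into env
```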
>Best,
>Gang
>
>
>2013/8/16 MRipley<mrip...@gmail.com>
>
>>Usually R is pretty good about not copying objects when it doesn't need
>>to.  However, the list() function seems to make unnecessary copies.  For
>>example:
>>
>>>system.time(x<-double(10^9))
>>   user  system elapsed
>>  1.772   4.280   7.017
>>>system.time(y<-double(10^9))
>>   user  system elapsed
>>  2.564   3.368   5.943
>>>system.time(z<-list(x,y))
>>   user  system elapsed
>>  5.520   6.748  12.304
>>
>>I have a function where I create two large arrays, manipulate them in
>>certain ways, and then return both as a list.  I'm optimizing the function,
>>so I'd like to be able to build the return list quickly.  The two large
>>arrays drop out of scope immediately after I make the list and return it,
>>so copying them is completely unnecessary.
>>
>>Is there some way to do this?  I'm not familiar with manipulating lists
>>through the .Call interface, and haven't been able to find much about this
>>in the documentation.  Might it be possible to write a fast (but possibly
>>unsafe) list function using .Call that doesn't make copies of the arguments?
>>
>>PS A few things I've tried.  First, this is not due to triggering garbage
>>collection -- even if I call gc() before list(x,y), it still takes a long
>>time.
>>
>>Also, I've tried rewriting the function by creating the list at the
>>beginning as in:
>>result <- list(x=double(10^9),y=double(10^9))
>>and then manipulating result$x and result$y but this made my code run
>>slower, as R seemed to be making other unnecessary copies while
>>manipulating elements of a list like this.
>>
>>I've considered (though not implemented) creating an environment rather
>>than a list, and returning the environment, but I'd rather find a simple
>>way of creating a list without making copies if possible.
>>
>>______________________________________________
>>R-help@r-project.org mailing list
>>https://stat.ethz.ch/mailman/listinfo/r-help
>>PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>and provide commented, minimal, self-contained, reproducible code.
>>
>
David Winsemius
Alameda, CA, USA

