Re: [Rd] Why does duplicate() make deep copies?

2013-09-05 Thread Simon Urbanek
On Sep 5, 2013, at 12:31 AM, Peter Meilstrup wrote:

 Some experimentation with the function below should convince you that the
 runtime of the code inside system.time() is proportional to size*number*times.
 I think it should only be proportional to number*times. The function only
 rearranges a list of references to vectors; it never modifies the vectors
 themselves.
 
 overcopying <- function(size, number, times) {
   # Make a list of NUMBER vectors of SIZE, then reorder the list.  The
   # vectors themselves are never touched, only their references are
   # moved around.  If R implements copy on write correctly the
   # elapsed time should be ~number*times.
   L <- replicate(number, list(vector("numeric", size)), simplify = FALSE)
   system.time(for (i in 1:times) {
     L[sample(number)] <- L
   })
 }
 
 I see that duplicate.c makes a recursive copy of each element when it
 encounters a VECSXP or a LISTSXP, which seems unnecessary (it should be
 sufficient to make a shallow copy and ensure NAMED is set on the elements).
 
 Why is R making apparently unnecessary deep copies?
 

Because ref counting is not recursive. Changing that breaks quite a few
packages that abuse the fact that copies are currently deep. The main problem
is that such breakage is pretty much impossible to detect ahead of time. That
said, there are efforts underway by Luke Tierney and Michael Lawrence to see
how far we can go without breaking everything.
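
As a rough sketch of the bookkeeping involved (an illustration, assuming the
usual NAMED-based copy-on-write for atomic vectors): assignment does not copy,
and the copy happens lazily on the first modification of a shared object. The
catch is that this sharing information is tracked per object and is not
propagated into the elements of a generic vector, which is why duplicate()
currently copies them as well.

x <- numeric(5)
y <- x            # no copy yet: x and y refer to the same vector
y[1] <- 42        # the vector is shared, so modifying it triggers a copy first
x[1]              # still 0 -- the original is untouched
# For a VECSXP this per-object bookkeeping does not extend recursively to the
# elements, so a shallow duplicate alone could leave shared elements writable.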

Cheers,
Simon



[Rd] Why does duplicate() make deep copies?

2013-09-04 Thread Peter Meilstrup
Some experimentation with the function below should convince you that the
runtime of the code inside system.time() is proportional to size*number*times.
I think it should only be proportional to number*times. The function only
rearranges a list of references to vectors; it never modifies the vectors
themselves.

overcopying <- function(size, number, times) {
  # Make a list of NUMBER vectors of SIZE, then reorder the list.  The
  # vectors themselves are never touched, only their references are
  # moved around.  If R implements copy on write correctly the
  # elapsed time should be ~number*times.
  L <- replicate(number, list(vector("numeric", size)), simplify = FALSE)
  system.time(for (i in 1:times) {
    L[sample(number)] <- L
  })
}
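
For example (a usage sketch; timings will of course depend on the machine and
the R build), holding number and times fixed while growing size should leave
the elapsed time roughly flat if only the outer list is copied, but it grows
with size when the elements are duplicated as well:

overcopying(size = 1e4, number = 100, times = 100)
overcopying(size = 1e5, number = 100, times = 100)
overcopying(size = 1e6, number = 100, times = 100)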

I see that duplicate.c makes a recursive copy of each element when it
encounters a VECSXP or a LISTSXP, which seems unnecessary (it should be
sufficient to make a shallow copy and ensure NAMED is set on the elements).

Why is R making apparently unnecessary deep copies?
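
One way to watch this from the R level (a sketch, assuming an interactive
session and an R build with memory profiling enabled, which is the default)
is to mark one of the element vectors with tracemem() and then reorder the
outer list:

L <- list(numeric(1e6), numeric(1e6))
tracemem(L[[1]])      # mark the first element vector
L[sample(2)] <- L     # reorder the outer list only
# a "tracemem" line printed here means the element itself was duplicated,
# even though only the list of references was rearranged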

Peter

