Tal,

For your first example, x is not duplicated in memory. If you compile R with --enable-memory-profiling, you have access to the tracemem() function, which will report whether x is duplicate()d:

> x <- rep(1,100)
> tracemem(x)
[1] "<0x8f71c38>"
> x[10] <- NA

This does not result in duplication of x, nor does assignment of x to y:

> y <- x

At this point, y internally references x. It's not until we modify y that x is duplicated, and y gets its own copy of the data:

> y[10] <- NA
tracemem[0x8f71c38 -> 0x91fff70]:

Likewise, no duplication occurs using `[<-`:

> x <- rep(1,100)
> tracemem(x)
[1] "<0x8e44900>"
> x <- `[<-`(x, list=10, values=NA)

But R is not yet smart enough to avoid a duplication here:

> x <- rep(1,100)
> tracemem(x)
[1] "<0x915d580>"
> x <- replace(x, list=10, values=NA)
tracemem[0x915d580 -> 0x915e090]: replace

Beyond these simple tests, it's difficult to know when R copies memory. I mentioned in another post recently that subsetting a vector will copy memory, but this is not reported by tracemem(). For example:

> tracemem(x)
[1] "<0x915ed50>"
> y <- x[1:100]
> tracemem(y)
[1] "<0x915f3f0>"
> identical(x,y)
[1] TRUE

Fortunately, memory is fairly cheap, and memory operations are pretty fast in modern operating systems, like GNU/Linux. I mostly find that the rate-limiting steps in my code are computational routines, like exp().

-Matt

On Wed, 2010-09-01 at 11:09 -0400, Tal Galili wrote:
> Hello all,
>
> A friend recently brought to my attention that vector assignment actually
> recreates the entire vector on which the assignment is performed.
>
> So for example, the code:
> x[10]<- NA # The original call (short version)
>
> Is really doing this:
> x<- replace(x, list=10, values=NA) # The original call (long version)
> # assigning a whole new vector to x
>
> Which is actually doing this:
> x<- `[<-`(x, list=10, values=NA) # The actual call
>
>
> Assuming this can be explained reasonably to the layman, my question is,
> why is it done this way?
> Why won't it just change the relevant pointer in memory?
>
> On small vectors it makes no difference.
> But on big vectors this might be (so I suspect) costly (in terms of time).
>
> I'm curious for your responses on the subject.
>
> Best,
> Tal

--
Matthew S. Shotwell
Graduate Student
Division of Biostatistics and Epidemiology
Medical University of South Carolina

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
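
The copy-on-modify behaviour traced in the reply can also be checked by value, which may help on builds where tracemem() is unavailable. What follows is only a minimal base R sketch under stated assumptions: has_profmem and z are illustrative names, and capabilities("profmem") is assumed to report whether the running R was built with memory profiling. The assertions check values only; they do not prove when a copy happens, which is what the tracemem() output is for.

```r
# Illustrative sketch; has_profmem and z are hypothetical names.
# tracemem() raises an error on builds without memory profiling, so guard it.
has_profmem <- isTRUE(capabilities("profmem")[[1]])

x <- rep(1, 100)
if (has_profmem) tracemem(x)

y <- x        # no copy yet: y refers to the same data as x
y[10] <- NA   # modifying y is what forces the copy (reported by tracemem)

# x is untouched; only y carries the change
stopifnot(!anyNA(x), is.na(y[10]), sum(is.na(y)) == 1)

# replace() hands back a modified copy and leaves its input alone
z <- replace(x, list = 10, values = NA)
stopifnot(!anyNA(x), identical(y, z))

if (has_profmem) untracemem(x)
```

On a profiling-enabled build the tracemem lines should appear at the `y[10] <- NA` and `replace()` steps; the addresses printed will differ from run to run.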