Hi all,

I've been learning about memory profiling in R, and I've been trying it out on read.table(). I'm getting a somewhat strange result, and I hope someone might be able to explain why.
After running

    Rprof("read-table.prof", memory.profiling = TRUE,
          line.profiling = TRUE, gc.profiling = TRUE,
          interval = interval)
    diamonds <- read.table("diamonds.csv", sep = ",", header = TRUE)
    Rprof(NULL)

and doing a lot of data manipulation, I end up with a table that displays the total memory (in megabytes) allocated and released (by gc) on each line of (a local copy of) read.table:

              file line  alloc release
    1 read-table.r  122 1.9797  1.1435
    2 read-table.r  165 1.1148  0.6511
    3 read-table.r  221 0.0763  0.0321
    4 read-table.r  222 0.4922  1.5057

Lines 122 and 165 are where I expect to see big allocations and releases - they call scan() and convert.type() respectively. Lines 221 and 222 are more of a mystery:

    class(data) <- "data.frame"
    attr(data, "row.names") <- row.names

Why do those lines need any allocations? I thought class<- and attr<- were primitives, and hence would modify in place.

Re-running with gctorture(TRUE) yields roughly similar numbers, although there is no memory release (because gc is called earlier), and the attribution of allocations to lines is probably more accurate, given that gctorture runs the code about 20x slower:

               file line    alloc  release
    25 read-table.r  221 0.387299 0.00e+00
    26 read-table.r  222 0.362964 0.00e+00

The whole object, once loaded, is ~4 MB, so those allocations represent fairly sizeable chunks of the total.

Any suggestions would be greatly appreciated. Thanks!

Hadley

--
Chief Scientist, RStudio
http://had.co.nz/

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
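For reference, base R's summaryRprof() can produce a by-line summary from a profile like the one above. A minimal, self-contained sketch - the file name "prof.out" and the sort() workload are illustrative stand-ins, not the read.table run from the post:

```r
# Profile a small allocation-heavy workload with memory and line profiling
# enabled, then summarise it. Line-level attribution requires source
# references (keep.source), so by-line rows may show "<no location>" for
# code entered without them; the by.self summary is always available.
Rprof("prof.out", memory.profiling = TRUE, line.profiling = TRUE,
      interval = 0.01)
d <- lapply(1:200, function(i) sort(runif(1e5)))  # ~1-2s of allocating work
Rprof(NULL)

# memory = "both" adds a mem.total column (MB) to the timing summaries;
# lines = "show" requests aggregation by source line where available.
s <- summaryRprof("prof.out", memory = "both", lines = "show")
head(s$by.self)

unlink("prof.out")  # clean up the profile file
```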
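One way to observe a replacement function copying, rather than modifying in place, is tracemem(), which prints a message whenever the traced object is duplicated (it needs R built with memory profiling enabled, as the standard CRAN binaries are). A minimal sketch with made-up names - "myclass" is purely illustrative:

```r
# Trace a vector, share it via a second binding, then assign a class.
# Because x is now referenced twice, class<- cannot modify it in place
# and duplicates it first; tracemem reports that duplication.
x <- runif(1e5)
invisible(tracemem(x))   # returns the object's address; suppress printing
y <- x                   # x is now shared, so it must be copied on modify
class(x) <- "myclass"    # expected to trigger a tracemem duplication message
untracemem(x)
untracemem(y)
```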