On May 8, 2006, at 9:47 AM, Thomas Lumley wrote:

> On Fri, 5 May 2006, Robert Citek wrote:
>> Reloading the 10 MM dataset:
>>
>> R > foo <- read.delim("dataset.010MM.txt")
>>
>> R > object.size(foo)
>> [1] 440000376
>>
>> R > gc()
>>            used  (Mb) gc trigger  (Mb) max used  (Mb)
>> Ncells 10183941 272.0   15023450 401.2 10194267 272.3
>> Vcells 20073146 153.2   53554505 408.6 50086180 382.2
>>
>> Combined, Ncells or Vcells appear to take up about 700 MB of RAM,
>> which is about 25% of the 3 GB available under Linux on 32-bit
>> architecture.  Also, removing foo seemed to free up "used" memory,
>> but didn't change the "max used":
>
> No, that's what "max" means.  You need gc(reset=TRUE) to reset the
> max.
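(An aside for anyone following along without the 10 MM file: a minimal sketch of the "max used" behaviour, assuming a throwaway 1e7-element numeric vector in place of the dataset. The helper name vcells_max_mb is made up for illustration; it relies only on gc() returning a matrix whose last column is the "max used" figure in Mb.)

```r
# Hypothetical helper: Vcells "max used" in Mb (last column of gc()'s matrix).
vcells_max_mb <- function(g) unname(g["Vcells", ncol(g)])

x <- numeric(1e7)            # stand-in for the dataset: about 76 Mb of doubles
invisible(gc())              # collect while x is alive so the peak is recorded
rm(x)

before <- gc()               # "used" drops, "max used" still shows the peak
after  <- gc(reset = TRUE)   # resets "max used" to the current "used"

cat("max used before reset (Mb):", vcells_max_mb(before), "\n")
cat("max used after reset  (Mb):", vcells_max_mb(after), "\n")
```

The peak stays in "max used" after rm() and only drops once gc(reset = TRUE) is called. Exact figures will vary with R version and platform.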
Yup, that worked (see below).  The example from ?gc wasn't that clear
to me.  Thanks for clarifying.

I also found it informative to compare loading data into a data.frame
vs a vector.

$ cat <<eof | R -q --no-save
gc()
foo <- read.delim("dataset.010MM.txt")
gc()
rm(foo)
gc()
gc(reset=TRUE)
eof

R > gc()
         used (Mb) gc trigger (Mb) max used (Mb)
Ncells 177865  4.8     407500 10.9   350000  9.4
Vcells  72114  0.6     786432  6.0   333941  2.6

R > foo <- read.delim("dataset.010MM.txt")

R > gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells 10179849 271.9   15023450 401.2 10180159 271.9
Vcells 20072448 153.2   47764583 364.5 46849682 357.5

R > rm(foo)

R > gc()
         used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells 179910  4.9   12018759 321.0 10181187 271.9
Vcells  72458  0.6   38211666 291.6 46849682 357.5

R > gc(reset=TRUE)
         used (Mb) gc trigger  (Mb) max used (Mb)
Ncells 179920  4.9    9615007 256.8   179920  4.9
Vcells  72482  0.6   30569332 233.3    72482  0.6


$ cat <<eof | R -q --no-save
gc()
foo <- scan("dataset.010MM.txt")
gc()
rm(foo)
gc()
gc(reset=TRUE)
eof

R > gc()
         used (Mb) gc trigger (Mb) max used (Mb)
Ncells 177865  4.8     407500 10.9   350000  9.4
Vcells  72114  0.6     786432  6.0   333941  2.6

R > foo <- scan("dataset.010MM.txt")
Read 10000000 items

R > gc()
           used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells   178230  4.8     407500  10.9   350000   9.4
Vcells 10072185 76.9   26713872 203.9 26456224 201.9

R > rm(foo)

R > gc()
         used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells 178286  4.8     407500  10.9   350000   9.4
Vcells  72190  0.6   21371097 163.1 26456224 201.9

R > gc(reset=TRUE)
         used (Mb) gc trigger  (Mb) max used (Mb)
Ncells 178296  4.8     407500  10.9   178296  4.8
Vcells  72214  0.6   17096877 130.5    72214  0.6

Regards,
- Robert
http://www.cwelug.org/downloads
Help others get OpenSource software.  Distribute FLOSS for
Windows, Linux, *BSD, and MacOS X with BitTorrent
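
P.S.  In the runs above, the data.frame cost roughly 270 Mb of Ncells plus 150 Mb of Vcells, against about 77 Mb of Vcells for the vector. For anyone who wants to try the same comparison without a 10 MM file, here is a minimal sketch. It assumes a small temporary file with a one-line header ("x") followed by one integer per line; the file layout, the size n, and the variable names are my own choices for illustration, not the layout of dataset.010MM.txt.

```r
n   <- 1e5
tmp <- tempfile(fileext = ".txt")

# Header line plus n integers; sprintf("%d") avoids "1e+05"-style output.
writeLines(c("x", sprintf("%d", seq_len(n))), tmp)

df  <- read.delim(tmp)                    # data.frame with one integer column
vec <- scan(tmp, skip = 1, quiet = TRUE)  # plain double vector, header skipped

size_df  <- as.numeric(object.size(df))   # bytes
size_vec <- as.numeric(object.size(vec))  # bytes

cat("data.frame:", size_df,  "bytes\n")
cat("vector:    ", size_vec, "bytes\n")

unlink(tmp)                               # clean up the temporary file
```

The sizes you see will depend on the R version and platform, so treat this as a way to measure rather than as a statement of what the numbers should be.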