Please try memory.limit() (a Windows-only function) to confirm how much memory R is allowed to use.
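For example, a minimal sketch (memory.limit() exists only in Windows builds of R; gc() works on any platform, and its rightmost "max used" columns report the session's peak usage):

#report the current memory limit, in Mb (Windows only)
memory.limit()
#request a larger limit of, say, 2047 Mb (Windows only)
memory.limit(size=2047)
#on any platform, show current and peak ("max used") memory use
gc()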
Additionally, read.delim() returns a data.frame. You could use the colClasses argument to control the type of each column (see the example below), or use scan(), which returns a vector and stores the data far more compactly: in the session below, the vector is roughly one sixth the size of the equivalent data.frame. Since your example session examines a single variable, a vector would suffice. Note also in the example below that summing large values stored as integer overflows, returning NA with a warning (a sketch of one way around this follows the quoted message at the end).

====================Begin Session====================================
> #create vector
> foovector<-scan(file="temp.txt")
Read 2490368 items
>
> #create data.frame
> foo<-read.delim(file="temp.txt",row.names=NULL,header=FALSE,colClasses=as.vector(c("numeric")))
> attributes(foo)$names<-"myfoo"
>
> foo2<-read.delim(file="temp.txt",row.names=NULL,header=FALSE,colClasses=as.vector(c("integer")))
> attributes(foo2)$names<-"myfoo"
>
> #vector from data.frame
> tmpfoo<-foo$myfoo
>
> #check size
> object.size(foo)
[1] 119538076
> object.size(foo2)
[1] 109576604
> object.size(foovector)
[1] 19922972
> object.size(tmpfoo)
[1] 19922972
>
> #check sums
> sum(tmpfoo)
[1] 2.498528e+13
> sum(foo$myfoo)
[1] 2.498528e+13
> sum(foo2$myfoo)
[1] NA
Warning message:
Integer overflow in sum(.); use sum(as.numeric(.))
> sum(foovector)
[1] 2.498528e+13
>
> #show type
> class(foo2$myfoo)
[1] "integer"
> class(foo$myfoo)
[1] "numeric"
> class(tmpfoo)
[1] "numeric"
> class(foovector)
[1] "numeric"
====================End Session====================================

----- Original Message -----
From: "Robert Citek" <[EMAIL PROTECTED]>
To: <r-help@stat.math.ethz.ch>
Sent: Friday, May 05, 2006 3:15 PM
Subject: Re: [R] large data set, error: cannot allocate vector

> On May 5, 2006, at 11:30 AM, Thomas Lumley wrote:
>> In addition to Uwe's message it is worth pointing out that gc()
>> reports the maximum memory that your program has used (the
>> rightmost two columns). You will probably see that this is large.
>
> Reloading the 10 MM dataset:
>
> R > foo <- read.delim("dataset.010MM.txt")
>
> R > object.size(foo)
> [1] 440000376
>
> R > gc()
>            used  (Mb) gc trigger  (Mb) max used  (Mb)
> Ncells 10183941 272.0   15023450 401.2 10194267 272.3
> Vcells 20073146 153.2   53554505 408.6 50086180 382.2
>
> Combined, Ncells and Vcells appear to take up about 700 MB of RAM,
> which is about 25% of the 3 GB available under Linux on a 32-bit
> architecture. Also, removing foo seemed to free up "used" memory,
> but didn't change the "max used":
>
> R > rm(foo)
>
> R > gc()
>          used (Mb) gc trigger  (Mb) max used  (Mb)
> Ncells 186694  5.0   12018759 321.0 10194457 272.3
> Vcells  74095  0.6   44173915 337.1 50085563 382.2
>
> Regards,
> - Robert
> http://www.cwelug.org/downloads
> Help others get OpenSource software. Distribute FLOSS
> for Windows, Linux, *BSD, and MacOS X with BitTorrent
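P.S. Regarding the integer overflow in the session above, here is a minimal sketch of one way around it (it assumes the same temp.txt file; the object name x is illustrative):

#read the values as integers, which need 4 bytes per element
#instead of the 8 bytes per element of a numeric (double) vector
x<-scan(file="temp.txt",what=integer())
#coerce to numeric before summing, as the warning message suggests,
#so the accumulation happens in double precision
sum(as.numeric(x))

This keeps the compact integer storage in memory while still computing the sum without overflow.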