I can store 100,000,000 records in about the same space on WinXP, with --max-mem-size set to 1700M. I have also successfully stored larger objects.
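For anyone following along, here's how that ceiling can be inspected and raised from inside R. This is a minimal sketch, assuming a Windows build of R (memory.size() and memory.limit() are Windows-only, and the numbers shown are illustrative, not output from my session):

  memory.size()              # MB currently in use by R
  memory.size(max = TRUE)    # most MB obtained from the OS so far
  memory.limit()             # current ceiling in MB

  memory.limit(size = 1700)  # raise the ceiling to ~1.7 GB for this session

  ## Equivalent setting at startup, from the command line:
  ##   Rgui.exe --max-mem-size=1700M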
Like you, I don't have enough memory to run summary() on a vector that size, though I've only got 2 GB of RAM. I have successfully allocated more RAM to R on my Linux box (it has 4 GB of RAM) and processed larger objects. Have you tried playing with the memory settings? My results are below.

-jason

> tmp <- 100000000:200000000
> length(tmp)/1000000
[1] 100
> gc()
           used  (Mb) gc trigger  (Mb)  max used   (Mb)
Ncells   172832   4.7     350000   9.4    350000    9.4
Vcells 50063180 382.0  120448825 919.0 150074853 1145.0
> object.size(tmp)/length(tmp)
[1] 4
> object.size(tmp)
[1] 4e+08
> print(object.size(tmp)/1024^2, digits=15)
[1] 381.469760894775
> summary(tmp)
Error: cannot allocate vector of size 390625 Kb

----- Original Message -----
From: "Robert Citek" <[EMAIL PROTECTED]>
To: <r-help@stat.math.ethz.ch>
Sent: Friday, May 05, 2006 9:30 AM
Subject: Re: [R] large data set, error: cannot allocate vector

> Oops. I was off by an order of magnitude. I meant 10^7 and 10^8 rows
> of data for the first and second data sets, respectively.
>
> On May 5, 2006, at 10:24 AM, Robert Citek wrote:
>> R > foo <- read.delim("dataset.010MM.txt")
>>
>> R > summary(foo)
>>      X15623
>>  Min.   :    1
>>  1st Qu.: 8152
>>  Median :16459
>>  Mean   :16408
>>  3rd Qu.:24618
>>  Max.   :32766
>
> Reloaded the 10MM set and ran an object.size:
>
> R > object.size(foo)
> [1] 440000376
>
> So, 10 MM numbers take about 440 MB, which works out to roughly 44
> bytes per value (440000376 / 10^7), not the 4 bytes a raw 32-bit
> integer needs (8 bits/byte * 4 bytes = 32 bits); the difference is
> presumably data.frame overhead such as row names. Either way, that
> scaling would explain why 10 MM numbers do work while 100 MM numbers
> won't (4 GB limit on a 32-bit machine).
>
> From Googling the archives, the solution that I've seen for working
> with large data sets seems to be moving to a 64-bit architecture.
> Short of that, are there any other generic workarounds, perhaps using
> an RDBMS or a CRAN package that enables working with arbitrarily
> large data sets?
>
> Regards,
> - Robert
> http://www.cwelug.org/downloads
> Help others get OpenSource software. Distribute FLOSS
> for Windows, Linux, *BSD, and MacOS X with BitTorrent

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html
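On the RDBMS workaround Robert asks about above: a minimal sketch using the DBI and RSQLite packages (package availability is assumed, and the file, table, and column names are hypothetical). The data stay on disk in SQLite, R only ever holds one chunk at a time, and the database computes the summary statistics:

  library(DBI)
  library(RSQLite)

  ## Open (or create) an on-disk database instead of loading into RAM.
  con <- dbConnect(RSQLite::SQLite(), dbname = "dataset.db")

  ## Load the flat file in manageable chunks of 1e6 rows.
  fh <- file("dataset.100MM.txt", open = "r")
  header <- readLines(fh, n = 1)   # discard the header line
  repeat {
    chunk <- tryCatch(
      read.delim(fh, header = FALSE, nrows = 1e6, col.names = "x"),
      error = function(e) NULL)    # hitting EOF raises an error
    if (is.null(chunk) || nrow(chunk) == 0) break
    dbWriteTable(con, "big", chunk, append = TRUE)
  }
  close(fh)

  ## Aggregation happens inside SQLite; only the tiny result returns to R.
  dbGetQuery(con, "SELECT MIN(x), AVG(x), MAX(x), COUNT(*) FROM big")

  dbDisconnect(con)

For very large files, bulk-loading outside R (e.g. with the sqlite3 command-line tool) would likely be faster than the chunked read.delim loop shown here.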