Spencer,

There have been a lot of discussions on this list about working with large datasets in R, so looking through those will probably inform you better than I can. With that said...
I have been trying to work with very large datasets as well (genetic datasets... maybe we're in the same boat?). The long and short of it: it is a challenge in R. Doable, but a challenge.

First, I'm guessing you're using 64-bit computing and a 64-bit version of R, right? (I'm sure you're aware that 32-bit computing caps how much RAM you can address.) The error message is telling you that R could not find a contiguous block of RAM large enough for whatever object it was trying to build at the moment the allocation failed; the total space taken up by your session was certainly much greater than 1.1 Gb.

How to avoid this problem? Short of reworking R to be more memory-efficient, your options are roughly these (rough sketches of several of them are pasted at the very end of this message, below your original post):

1. Buy more RAM.

2. Use a package designed to store objects on the hard drive rather than in RAM (ff, filehash, or R.huge -- I have personally had the best luck with the last of these).

3. Use a package designed to fit linear models from the cross-products t(X) %*% X rather than holding all of X in memory (biglm -- I haven't used this yet).

4. Use the RSQLite package, which provides an interface between R and the SQLite database engine, so that you only bring into memory the portion of the database you need to work with. (I have yet to delve into this one myself.)

5. If you're unwilling to do any of the above, the final option is to read in only the part of the matrix you need, work with that portion, and then remove it from memory. Slow, but doable for most things.

Oh yeah: I have found that frequent calls to gc() help out enormously, regardless of what ?gc implies. And I constantly keep an eye on the Unix top command (not sure what the equivalent is on Windows) to check how much RAM my session is taking up.

Best of luck!

Matt

On Nov 16, 2007 5:24 PM, sj <[EMAIL PROTECTED]> wrote:
> All,
>
> I am working with a large data set (~450,000 rows by 34 columns) and am
> trying to fit a regression model (I have tried several procedures: psm
> from the Design package, lm, glm). However, whenever I try to fit the
> model I get the following error:
>
>     Error: cannot allocate vector of size 1.1 Gb
>
> Here are the specs of the machine and the version of R I am using:
>
>     Windows Server 2003 R2 Enterprise x64 Service Pack 2
>     Intel Pentium D 3.00 GHz
>     3.93 GB RAM
>     R 2.6.0
>
> When I type the command
>
>     memory.limit()
>
> I get:
>
>     3583.875
>
> I assume that means I have about 3.5 GB at my disposal, so I am confused
> why I can't allocate a vector of 1.1 GB. Any suggestions on what to do?
>
> Best,
>
> Spencer

--
Matthew C Keller
Asst. Professor of Psychology
University of Colorado at Boulder
www.matthewckeller.com
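P.S. A few rough sketches that may help, with the caveat that the file names, column names, and object names below are all made up for illustration. First, checking that you really are on a 64-bit build of R and keeping an eye on where the memory goes ("mydata" stands in for whatever object you have loaded):

    sessionInfo()              # the Platform line shows whether this is a 64-bit build
    .Machine$sizeof.pointer    # 8 on a 64-bit build of R, 4 on a 32-bit build
    memory.limit()             # Windows only: the current memory cap, in MB
    object.size(mydata)        # how much RAM one object is really using
    gc()                       # force a collection and print a summary of usage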
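Second, the general shape of the store-it-on-disk approach, using filehash as the example (ff and R.huge have their own interfaces, so treat this purely as an illustration of the idea):

    library(filehash)

    dbCreate("spencer_db")                 # one-time: create the on-disk database
    db <- dbInit("spencer_db")
    dbInsert(db, "mydata", mydata)         # push the big object out to the hard drive
    rm(mydata); gc()                       # drop the in-RAM copy
    mydata <- dbFetch(db, "mydata")        # pull it back only when you need it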
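Third, the cross-products idea. biglm builds the fit from running cross-products, so it only ever needs one chunk of rows in RAM at a time. Something along these lines, assuming a comma-separated file, numeric predictors, and made-up column names in the formula:

    library(biglm)

    chunk.size <- 50000
    con <- file("bigdata.csv", open = "r")

    ## the first chunk supplies the header and starts the fit
    chunk <- read.table(con, header = TRUE, sep = ",", nrows = chunk.size)
    fit <- biglm(y ~ x1 + x2, data = chunk)

    ## feed the remaining chunks in one at a time
    repeat {
        chunk <- try(read.table(con, header = FALSE, sep = ",",
                                nrows = chunk.size, col.names = names(chunk)),
                     silent = TRUE)
        if (inherits(chunk, "try-error")) break   # no rows left to read
        fit <- update(fit, chunk)
    }
    close(con)
    summary(fit)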
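Fourth, the RSQLite route (which, like I said, I haven't used myself, so treat this as a sketch). It assumes the data have already been loaded into an SQLite file -- for example with the sqlite3 command-line tool -- under a table called "mydata":

    library(RSQLite)

    con <- dbConnect(SQLite(), dbname = "bigdata.sqlite")

    ## bring into R only the columns (and rows) the model actually needs
    dat <- dbGetQuery(con, "SELECT y, x1, x2 FROM mydata")
    fit <- lm(y ~ x1 + x2, data = dat)

    dbDisconnect(con)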
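Finally, the read-only-what-you-need option. read.table's colClasses argument drops columns at read time, which already buys a lot when only a handful of the 34 columns go into the model:

    ## keep 3 of the 34 columns: "NULL" means skip the column, NA means read it
    ## and let read.table guess its type (the column positions here are made up)
    cc <- rep("NULL", 34)
    cc[c(1, 5, 9)] <- NA
    dat <- read.table("bigdata.csv", header = TRUE, sep = ",", colClasses = cc)

    fit <- lm(y ~ x1 + x2, data = dat)

    rm(dat); gc()    # hand the memory back before moving on to the next piece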